



CALCULUS 


Michael Spivak 






W. A. Benjamin, Inc. 
LONDON 


MENLO PARK, CALIFORNIA 


Michael Spivak


BRANDEIS UNIVERSITY 




WORLD STUDENT SERIES EDITION 


A complete and unabridged reprint of the original
American textbook, this World Student Series edition
may be sold only in those countries to which it is
consigned by Addison-Wesley or its authorized trade
distributors. It may not be re-exported from the country
to which it has been consigned, and it may not be sold in
the United States of America or its possessions.


The publisher is pleased to acknowledge the assistance
of Wladislaw Finne, who designed the text and cover,
and F. W. Taylor, who produced the illustrations.


Copyright © 1967 by W. A. Benjamin, Inc. All rights
reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form
or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written
permission of the publisher. Original edition published
in the United States of America. Published simultaneously
in Canada. Philippines copyright 1967.


Library of Congress Catalog Card Number: 67-20770


Dedicated to the Memory of Y. P.


PREFACE 


I hold every man a debtor 
to his profession, 
from the which as men of course 
doe seeke to receive countenance and profit, 
so ought they of duty to endeavour 
themselves by way of amends, 
to be a help and 
ornament thereunto. 
FRANCIS BACON 


Every aspect of this book was influenced by the desire to present calculus not 
merely as a prelude to but as the first real encounter with mathematics. 
Since the foundations of analysis provided the arena in which modern modes 
of mathematical thinking developed, calculus ought to be the place in which 
to expect, rather than avoid, the strengthening of insight with logic. In addi- 
tion to developing the students’ intuition about the beautiful concepts of 
analysis, it is surely equally important to persuade them that precision and 
rigor are neither deterrents to intuition, nor ends in themselves, but the 
natural medium in which to formulate and think about mathematical 
questions. 

This goal implies a view of mathematics which, in a sense, the entire book 
attempts to defend. No matter how well particular topics may be developed, 
the goals of this book will be realized only if it succeeds as a whole. For this 
reason, it would be of little value merely to list the topics covered, or to men- 
tion pedagogical practices and other innovations. Even the cursory glance 
customarily bestowed on new calculus texts will probably tell more than any 
such extended advertisement, and the teacher with strong feelings about 
particular aspects of calculus will know just where to look to see if this book 
fulfills his requirements. 

A few features do require explicit comment, however. Of the twenty-nine 
chapters in the book, two (starred) chapters are optional, and the three chap- 
ters comprising Part V have been included only for the benefit of those 
students who might want to examine on their own a construction of the real 
numbers. Moreover, the appendices to Chapters 3 and 11 also contain optional 
material. 

The order of the remaining chapters is intentionally quite inflexible, since
the purpose of the book is to present calculus as the evolution of one idea, not
as a collection of “topics.” Since the most exciting concepts of calculus do not
appear until Part III, I should point out that Parts I and II will probably
require less time than their length suggests; although the entire book covers
a one-year course, the chapters are not meant to be covered at any uniform
rate. A rather natural dividing point does occur between Parts II and III, so
it is possible to reach differentiation and integration even more quickly by
treating Part II very briefly, perhaps returning later for a more detailed
treatment. This arrangement corresponds to the traditional organization
of most calculus courses, but I feel that it will only diminish the value of the
book for students who have seen a small amount of calculus previously, and
for bright students with a reasonable background.

The problems have been designed with this particular audience in mind.
They range from straightforward, but not overly simple, exercises which
develop basic techniques and test understanding of concepts, to problems of
considerable difficulty and, I hope, of comparable interest. There are about
625 problems in all. Those which emphasize manipulations usually contain
many examples, numbered with small Roman numerals, while small letters
are used to label interrelated parts in other problems. Some indication of
relative difficulty is provided by a system of starring and double starring, but
there are so many criteria for judging difficulty, and so many hints have been
provided, especially for harder problems, that this guide is not completely
reliable. Many problems are so difficult, especially if the hints are not
consulted, that the best of students will probably have to attempt only those which
especially interest them; from the less difficult problems it should be easy to
select a portion which will keep a good class busy, but not frustrated. The
answer section contains solutions to about half the examples from an
assortment of problems that should provide a good test of technical competence. A
separate answer book contains the solutions of the other parts of these
problems, and of all the other problems as well. Finally, there is a Suggested
Reading list, to which the problems often refer, and an index of symbols.

I am grateful for the opportunity to mention the many people to whom I 
owe my thanks. Mrs. Jane Bjorkgren performed prodigious feats of typing
that compensated for my fitful production of the manuscript. Mr. Richard 
Serkey helped collect the material which provides historical sidelights in the 
problems, and Mr. Richard Weiss supplied the answers appearing in the 
back of the book. I am especially grateful to my friends Michael Freeman, 
Jay Goldman, Anthony Phillips, and Robert Wells for the care with which 
they read, and the relentlessness with which they criticized, a preliminary 
version of the book. Needless to say, they are not responsible for the deficien- 
cies which remain, especially since I sometimes rejected suggestions which 
would have made the book appear suitable for a larger group of students. 
I must express my admiration for the editors and staff of W. A. Benjamin, Inc., 
who were always eager to increase the appeal of the book, while recognizing 
the audience for which it was intended. 

The inadequacies which preliminary editions always involve were gallantly 
endured by a rugged group of freshmen in the honors mathematics course at 
Brandeis University during the academic year 1965-1966. About half of this 
course was devoted to algebra and topology, while the other half covered 
calculus, with the preliminary edition as the text. It is almost obligatory in 
such circumstances to report that the preliminary version was a gratifying 
success. This is always safe (after all, the class is unlikely to rise up in a body
and protest publicly), but the students themselves, it seems to me, deserve
the right to assign credit for the thoroughness with which they absorbed 
an impressive amount of mathematics. I am content to hope that some 
other students will be able to use the book to such good purpose, and with 
such enthusiasm. 


Waltham, Massachusetts MICHAEL SPIVAK 
February 1967 


PREFACE TO THE 
WORLD STUDENT SERIES EDITION 


In American universities, “calculus” courses are so varied in content and
viewpoint that only tradition dictates that the same title be given to all. Some
of these courses cover only the most rudimentary techniques, sedulously
shunning all theory. At the other extreme, so-called “honors” courses concentrate
on the foundations, and emphasize the theoretical importance of the basic
processes of calculus, which have often already been seen by the students in
their high school classes. This book was written for just such courses. In the
United States, they are so rare that I felt constrained to begin my original
preface with a lengthy defense, or apology for the fact that it was really written
for the very type of course which in Great Britain and Europe is quite standard,
and usually called “An introduction to Real analysis”. From that long prologue
I would like to salvage only one thought, and once again express my feeling
that the foundations of analysis provide one of the most beautiful instances of
the welding together of mathematical intuition and mathematical rigor, and
that ignoring either one only weakens, rather than strengthens, the other.

A few comments about the contents and arrangement should be made, 
especially for the student who is learning the material on his own. Of the
twenty-nine chapters in the book, two (starred) chapters are “optional”, and the three
chapters comprising Part V might be thought of in the same way. Moreover,
the appendices to Chapters 3 and 11 also contain “optional” material. Aside
from this, the order of the remaining chapters is intentionally quite inflexible,
since the purpose of the book is to present calculus as the evolution of one idea,
not as a collection of “topics”. Since the most exciting concepts of calculus
do not appear until Part III, I should point out that Parts I and II will probably
require less time than their length suggests, and although the entire book
covers a one-year course, the chapters are not meant to be covered at any
uniform rate. A rather natural dividing point does occur between Parts II and
III, so it is possible to reach differentiation and integration even more quickly by
reading Part II very briefly, perhaps returning later for more detailed study. But
I personally feel that the route set forth here, though it requires perseverance
at the beginning, will be the most rewarding.

The problems, without which any mathematics text is incomplete, range from
straightforward (but not overly simple) exercises which develop basic
techniques and test understanding of concepts, to problems of considerable
difficulty and, I hope, of comparable interest. There are about 625 problems in all.
Those which emphasize manipulations usually contain many examples,
numbered with small Roman numerals, while small letters are used to label
interrelated parts in other problems. Some indication of relative difficulty is
provided by a system of starring and double starring, but there are so many
criteria for judging difficulty, and so many hints have been provided, especially
for harder problems, that this guide is not completely reliable. Many problems
are so difficult, especially if the hints are not consulted, that the best of students
will probably have to attempt only those which especially interest them. From
the less difficult problems it should be easy to select a portion which will be
challenging, but not frustrating. The Answer Section contains solutions to
about half the examples from an assortment of problems that should provide a
good test of technical competence. A separate answer book, Supplement to
Calculus, contains the solutions of the other parts of these problems, and of all
the other problems as well. Finally, there is a Suggested Reading list, to which
the problems often refer, and a Glossary of Symbols.

Once again I will take the opportunity to mention the many people to whom
I owe my thanks. Mrs. Jane Bjorkgren performed prodigious feats of typing
that compensated for my fitful production of the manuscript. Mr. Richard 
Serkey helped collect the material which provided historical sidelights in the 
problems, and Mr. Richard Weiss supplied the answers appearing in the back 
of the book. I am especially grateful to my friends Michael Freeman, Jay 
Goldman, Anthony Phillips, and Robert Wells for the care with which they 
read, and the relentlessness with which they criticized, a preliminary version 
of the book. 

The inadequacies which preliminary editions always involve were gallantly 
endured by a rugged group of freshmen in the honors mathematics course at 
Brandeis University during the academic year 1965-1966. It is almost obli- 
gatory in such circumstances to report that the preliminary version was a 
gratifying success. This is always safe (after all, the class is unlikely to rise up
in a body and protest publicly), but the students themselves, it seems to me,
deserve the right to assign credit for the thoroughness with which they absorbed 
an impressive amount of mathematics. I am content to hope that some other 
students will be able to use the book to such good purpose, and with such 
enthusiasm. 


Berkeley, California MICHAEL SPIVAK 


CONTENTS 


PART I Prologue

1 Basic Properties of Numbers
2 Numbers of Various Sorts

PART II Foundations

3 Functions
Appendix. Ordered Pairs
4 Graphs
5 Limits
6 Continuous Functions
7 Three Hard Theorems
8 Least Upper Bounds

PART III Derivatives and Integrals

9 Derivatives
10 Differentiation
11 Significance of the Derivative
Appendix. Convexity and Concavity
12 Inverse Functions
13 Integrals
14 The Fundamental Theorem of Calculus
15 The Trigonometric Functions
*16 π is Irrational
17 The Logarithm and Exponential Functions
18 Integration in Elementary Terms

PART IV Infinite Sequences and Infinite Series

19 Approximation by Polynomial Functions
*20 e is Transcendental
21 Infinite Sequences
22 Infinite Series
23 Uniform Convergence and Power Series
24 Complex Numbers
25 Complex Functions
26 Complex Power Series

PART V Epilogue

27 Fields
28 Construction of the Real Numbers
29 Uniqueness of the Real Numbers

Suggested Reading
Answers (to selected problems)
Glossary of Symbols
Index


PART I


PROLOGUE 


To be conscious that 
you are ignorant is a great step
to knowledge. 


BENJAMIN DISRAELI 


CHAPTER 1


BASIC PROPERTIES OF NUMBERS 


The title of this chapter expresses in a few words the mathematical knowledge
required to read this book. In fact, this short chapter is simply an explanation
of what is meant by the “basic properties of numbers,” all of which (addition
and multiplication, subtraction and division, solutions of equations and
inequalities, factoring and other algebraic manipulations) are already
familiar to us. Nevertheless, this chapter is not a review. Despite the familiarity
of the subject, the survey we are about to undertake will probably seem quite 
novel; it does not aim to present an extended review of old material, but to 
condense this knowledge into a few simple and obvious properties of numbers. 
Some may even seem too obvious to mention, but a surprising number of 
diverse and important facts turn out to be consequences of the ones we shall 
emphasize. 

Of the twelve properties which we shall study in this chapter, the first nine
are concerned with the fundamental operations of addition and multiplication.
For the moment we consider only addition: this operation is performed
on a pair of numbers; the sum a + b exists for any two given numbers a and b
(which may possibly be the same number, of course). It might seem reasonable
to regard addition as an operation which can be performed on several
numbers at once, and consider the sum a_1 + ··· + a_n of n numbers
a_1, ..., a_n as a basic concept. It is more convenient, however, to consider
addition of pairs of numbers only, and to define other sums in terms of sums
of this type. For the sum of three numbers a, b, and c, this may be done in two
different ways. One can first add b and c, obtaining b + c, and then add a to
this number, obtaining a + (b + c); or one can first add a and b, and then
add the sum a + b to c, obtaining (a + b) + c. Of course, the two compound
sums obtained are equal, and this fact is the very first property we shall list:


(P1) If a, b, and c are any numbers, then
     a + (b + c) = (a + b) + c.


The statement of this property clearly renders a separate concept of the sum
of three numbers superfluous; we simply agree that a + b + c denotes the
number a + (b + c) = (a + b) + c. Addition of four numbers requires
similar, though slightly more involved, considerations. The symbol a + b +
c + d is defined to mean
   (1) ((a + b) + c) + d,
or (2) (a + (b + c)) + d,
or (3) a + ((b + c) + d),
or (4) a + (b + (c + d)),
or (5) (a + b) + (c + d).




This definition is unambiguous since these numbers are all equal. Fortunately,
this fact need not be listed separately, since it follows from the property P1
already listed. For example, we know from P1 that

     (a + b) + c = a + (b + c),

and it follows immediately that (1) and (2) are equal. The equality of (2) and
(3) is a direct consequence of P1, although this may not be apparent at first
sight (one must let b + c play the role of b in P1, and d the role of c). The
equalities (3) = (4) = (5) are also simple to prove.

It is probably obvious that an appeal to P1 will also suffice to prove the 
equality of the 14 possible ways of summing five numbers, but it may not be so 
clear how we can reasonably arrange a proof that this is so without actually 
listing these 14 sums. Such a procedure is feasible, but would soon cease to be 
if we considered collections of six, seven, or more numbers; it would be totally 
inadequate to prove the equality of all possible sums of an arbitrary finite 
collection of numbers a_1, ..., a_n. This fact may be taken for granted, but
for those who would like to worry about the proof (and it is worth worrying 
about once) a reasonable approach is outlined in Problem 23. Henceforth, we 
shall usually make a tacit appeal to the results of this problem and write sums 
a_1 + ··· + a_n with a blithe disregard for the arrangement of parentheses.
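
For readers who would like to see the count of 14 without listing the sums by hand, the following is a minimal sketch, not a proof. It assumes exact rational arithmetic (Python's `Fraction`) as a stand-in for "numbers", and the helper name `parenthesized_sums` is invented for this illustration. It builds every arrangement of parentheses from sums of pairs only and checks, for one particular choice of numbers, that all arrangements give the same value.

```python
from fractions import Fraction


def parenthesized_sums(terms):
    """Return (expression, value) for every way of fully parenthesizing
    terms[0] + ... + terms[-1] using addition of pairs only."""
    if len(terms) == 1:
        return [(str(terms[0]), terms[0])]
    results = []
    # Split the list into a left block and a right block, form every sum of
    # each block recursively, then add the two partial sums as a pair.
    for split in range(1, len(terms)):
        for left_text, left_value in parenthesized_sums(terms[:split]):
            for right_text, right_value in parenthesized_sums(terms[split:]):
                results.append(
                    (f"({left_text} + {right_text})", left_value + right_value)
                )
    return results


# Five arbitrary sample numbers (an assumption of this sketch).
numbers = [Fraction(1, 2), Fraction(-3), Fraction(5, 7), Fraction(2), Fraction(-1, 3)]

four = parenthesized_sums(numbers[:4])
five = parenthesized_sums(numbers)

print(len(four))                          # 5 arrangements of four numbers
print(len(five))                          # 14 arrangements of five numbers
print(len({value for _, value in five}))  # 1 distinct value among the 14
```

Checking one sample of numbers is of course only evidence; the general statement still rests on P1.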

The number 0 has one property so important that we list it next: 


(P2) If a is any number, then
     a + 0 = 0 + a = a.

An important role is also played by 0 in the third property of our list:

(P3) For every number a, there is a number -a such that
     a + (-a) = (-a) + a = 0.


Property P2 ought to represent a distinguishing characteristic of the number 
0, and it is comforting to note that we are already in a position to prove 
this. Indeed, if a number x satisfies

     a + x = a


for any one number a, then x = 0 (and consequently this equation also holds 
for all numbers a). The proof of this assertion involves nothing more than 
subtracting a from both sides of the equation, in other words, adding —a to 
both sides; as the following detailed proof shows, all three properties P1-P3
must be used to justify this operation. 


If        a + x = a,
then      (-a) + (a + x) = (-a) + a = 0;
hence     ((-a) + a) + x = 0;
hence     0 + x = 0;
hence     x = 0.




As we have just hinted, it is convenient to regard subtraction as an operation 
derived from addition: we consider a - b to be an abbreviation for a + (-b).
It is then possible to find the solution of certain simple equations by a series 
of steps (each justified by P1, P2, or P3) similar to the ones just presented for 
the equation a + x = a. For example: 


If        x + 3 = 5,
then      (x + 3) + (-3) = 5 + (-3);
hence     x + (3 + (-3)) = 5 - 3 = 2;
hence     x + 0 = 2;
hence     x = 2.


Naturally, such elaborate solutions are of interest only until you become con- 
vinced that they can always be supplied. In practice, it is usually just a waste 
of time to solve an equation by indicating so explicitly the reliance on proper- 
ties P1, P2, and P3 (or any of the further properties we shall list). 

Only one other property of addition remains to be listed. When considering
the sums of three numbers a, b, and c, only two sums were mentioned: (a + b)
+ c and a + (b + c). Actually, several other arrangements are obtained if
the order of a, b, and c is changed. That these sums are all equal depends on

(P4) If a and b are any numbers, then
     a + b = b + a.


The statement of P4 is meant to emphasize that although the operation of
addition of pairs of numbers might conceivably depend on the order of the
two numbers, in fact it does not. It is helpful to remember that not all
operations are so well behaved. For example, subtraction does not have this
property: usually a - b ≠ b - a. In passing we might ask just when a - b does
equal b - a, and it is amusing to discover how powerless we are if we rely
only on properties P1-P4 to justify our manipulations. Algebra of the most
elementary variety shows that a - b = b - a only when a = b. Nevertheless,
it is impossible to derive this fact from properties P1-P4; it is instructive to
examine the elementary algebra carefully and determine which step(s)
cannot be justified by P1-P4. We will indeed be able to justify all steps in
detail when a few more properties are listed. Oddly enough, the
crucial property involves multiplication.

The basic properties of multiplication are fortunately so similar to those for
addition that little comment will be needed; both the meaning and the
consequences should be clear. (As in elementary algebra, the product of a and b
will be denoted by a · b, or simply ab.)

(P5) If a, b, and c are any numbers, then
     a · (b · c) = (a · b) · c.

(P6) If a is any number, then
     a · 1 = 1 · a = a.




     Moreover, 1 ≠ 0.

(The assertion that 1 ≠ 0 may seem a strange fact to list, but we have to
list it, because there is no way it could possibly be proved on the basis of
the other properties listed; these properties would all hold if there were
only one number, namely, 0.)


(P7) For every number a ≠ 0, there is a number a^{-1} such that
     a · a^{-1} = a^{-1} · a = 1.

(P8) If a and b are any numbers, then
     a · b = b · a.


One detail which deserves emphasis is the appearance of the condition
a ≠ 0 in P7. This condition is quite necessary; since 0 · b = 0 for all numbers
b, there is no number 0^{-1} satisfying 0 · 0^{-1} = 1. This restriction has an
important consequence for division. Just as subtraction was defined in terms of
addition, so division is defined in terms of multiplication: the symbol a/b
means a · b^{-1}. Since 0^{-1} is meaningless, a/0 is also meaningless; division by 0
is always undefined.

Property P7 has two important consequences. If a · b = a · c, it does not
necessarily follow that b = c; for if a = 0, then both a · b and a · c are 0, no
matter what b and c are. However, if a ≠ 0, then b = c; this can be deduced
from P7 as follows:

If        a · b = a · c and a ≠ 0,
then      a^{-1} · (a · b) = a^{-1} · (a · c);
hence     (a^{-1} · a) · b = (a^{-1} · a) · c;
hence     1 · b = 1 · c;
hence     b = c.


It is also a consequence of P7 that if a · b = 0, then either a = 0 or b = 0.
In fact,

if        a · b = 0 and a ≠ 0,
then      a^{-1} · (a · b) = 0;
hence     (a^{-1} · a) · b = 0;
hence     1 · b = 0;
hence     b = 0.

(It may happen that a = 0 and b = 0 are both true; this possibility is not
excluded when we say “either a = 0 or b = 0”; in mathematics “or” is
always used in the sense of “one or the other, or both.”)

This latter consequence of P7 is constantly used in the solution of equations. 
Suppose, for example, that a number x is known to satisfy 


     (x - 1)(x - 2) = 0.

Then it follows that either x - 1 = 0 or x - 2 = 0; hence x = 1 or x = 2.
On the basis of the eight properties listed so far it is still possible to prove 




very little. Listing the next property, which combines the operations of addi- 
tion and multiplication, will alter this situation drastically. 


(P9) If a, b, and c are any numbers, then
     a · (b + c) = a · b + a · c.

(Notice that the equation (b + c) · a = b · a + c · a is also true, by P8.)


As an example of the usefulness of P9 we will now determine just when a - b
= b - a:

If             a - b = b - a,
then           (a - b) + b = (b - a) + b = b + (b - a);
hence          a = b + b - a;
hence          a + a = (b + b - a) + a = b + b.
Consequently   a · (1 + 1) = b · (1 + 1),
and therefore  a = b.


A second use of P9 is the justification of the assertion a · 0 = 0 which we have
already made, and even used in a proof on page 6 (can you find where?).
This fact was not listed as one of the basic properties, even though no proof
was offered when it was first mentioned. With P1-P8 alone a proof was not
possible, since the number 0 appears only in P2 and P3, which concern
addition, while the assertion in question involves multiplication. With P9 the
proof is simple, though perhaps not obvious: We have

     a · 0 + a · 0 = a · (0 + 0)
                   = a · 0;

as we have already noted, this immediately implies (by adding -(a · 0) to
both sides) that a · 0 = 0.

A series of further consequences of P9 may help explain the somewhat
mysterious rule that the product of two negative numbers is positive. To begin
with, we will establish the more easily acceptable assertion that (-a) · b =
-(a · b). To prove this, note that

     (-a) · b + a · b = [(-a) + a] · b
                      = 0 · b
                      = 0.

It follows immediately (by adding -(a · b) to both sides) that (-a) · b =
-(a · b). Now note that

     (-a) · (-b) + [-(a · b)] = (-a) · (-b) + (-a) · b
                              = (-a) · [(-b) + b]
                              = (-a) · 0
                              = 0.

Consequently, adding (a · b) to both sides, we obtain

     (-a) · (-b) = a · b.




The fact that the product of two negative numbers is positive is thus a
consequence of P1-P9. In other words, if we want P1 to P9 to be true, the rule for the
product of two negative numbers is forced upon us.

The various consequences of P9 examined so far, although interesting and 
important, do not really indicate the significance of P9; after all we could have 
listed each of these properties separately. Actually, P9 is the justification for 
almost all algebraic manipulations. For example, although we have shown 
how to solve the equation 


     (x - 1)(x - 2) = 0,

we can hardly expect to be presented with an equation in this form. We are
more likely to be confronted with the equation

     x^2 - 3x + 2 = 0.

The “factorization” x^2 - 3x + 2 = (x - 1)(x - 2) is really a triple use
of P9:

     (x - 1) · (x - 2) = x · (x - 2) + (-1) · (x - 2)
                       = x · x + x · (-2) + (-1) · x + (-1) · (-2)
                       = x^2 + x · [(-2) + (-1)] + 2
                       = x^2 - 3x + 2.
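
The expansion above can also be carried out mechanically. The following is only a sketch under stated assumptions: a polynomial is represented by its list of coefficients in increasing powers of x, and the helper names (`poly_add`, `poly_times_term`, `poly_mul`, `evaluate`) are invented for this illustration. The product is formed one term at a time, which is precisely the repeated use of P9, and the roots are then searched for among a few small integers.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists [c0, c1, c2, ...]."""
    length = max(len(p), len(q))
    p = p + [0] * (length - len(p))
    q = q + [0] * (length - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_times_term(p, coefficient, power):
    """Multiply the polynomial p by the single term coefficient * x**power."""
    return [0] * power + [coefficient * c for c in p]


def poly_mul(p, q):
    """Multiply p by q by distributing q over each term of p (property P9)."""
    result = [0]
    for power, coefficient in enumerate(p):
        result = poly_add(result, poly_times_term(q, coefficient, power))
    return result


def evaluate(p, x):
    """Value of the polynomial p at the number x."""
    return sum(c * x**k for k, c in enumerate(p))


x_minus_1 = [-1, 1]   # represents -1 + x
x_minus_2 = [-2, 1]   # represents -2 + x

product = poly_mul(x_minus_1, x_minus_2)
print(product)        # [2, -3, 1], that is, 2 - 3x + x^2

roots = [x for x in range(-5, 6) if evaluate(product, x) == 0]
print(roots)          # [1, 2]
```

The search over the integers from -5 to 5 is merely a convenience of the sketch; the argument in the text shows that 1 and 2 are the only solutions among all numbers.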


A final illustration of the importance of P9 is the fact that this property is 
actually used every time one multiplies arabic numerals. For example, the 
calculation 

       13
     × 24
     ----
       52
      26
     ----
      312

is a concise arrangement for the following equations:

     13 · 24 = 13 · (2 · 10 + 4)
             = 13 · 2 · 10 + 13 · 4
             = 26 · 10 + 52.

(Note that moving 26 to the left in the above calculation is the same as writing
26 · 10.) The multiplication 13 · 4 = 52 uses P9 also:

     13 · 4 = (1 · 10 + 3) · 4
            = 1 · 10 · 4 + 3 · 4
            = 4 · 10 + 12
            = 4 · 10 + 1 · 10 + 2
            = (4 + 1) · 10 + 2
            = 5 · 10 + 2
            = 52.
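
The same bookkeeping can be written out as a short program. This is a sketch only, restricted to non-negative whole numbers; the function names `digits_low_to_high` and `long_multiply` are invented here. The second factor is written as a sum of digits times powers of 10, and the first factor is distributed over that sum, which yields the rows of the familiar arrangement.

```python
def digits_low_to_high(n):
    """Decimal digits of a non-negative integer, least significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % 10)
        n //= 10
    return digits


def long_multiply(a, b):
    """Multiply a by b as in the school algorithm: write b as the sum of
    d_k * 10**k over its digits d_k and distribute a over that sum (P9)."""
    partial_products = []
    for k, d in enumerate(digits_low_to_high(b)):
        # One row of the written calculation: a * (d * 10**k).
        partial_products.append((a * d) * 10**k)
    return partial_products, sum(partial_products)


rows, total = long_multiply(13, 24)
print(rows)    # [52, 260]
print(total)   # 312
```

The row 260 is the 26 that is "moved to the left" in the written calculation.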




The properties P1-P9 have descriptive names which are not essential to 
remember, but which are often convenient for reference. We will take this 
opportunity to list properties P1-P9 together and indicate the names by 
which they are commonly designated. 


(P1) (Associative law for addition)             a + (b + c) = (a + b) + c.
(P2) (Existence of an additive identity)        a + 0 = 0 + a = a.
(P3) (Existence of additive inverses)           a + (-a) = (-a) + a = 0.
(P4) (Commutative law for addition)             a + b = b + a.
(P5) (Associative law for multiplication)       a · (b · c) = (a · b) · c.
(P6) (Existence of a multiplicative identity)   a · 1 = 1 · a = a; 1 ≠ 0.
(P7) (Existence of multiplicative inverses)     a · a^{-1} = a^{-1} · a = 1, for a ≠ 0.
(P8) (Commutative law for multiplication)       a · b = b · a.
(P9) (Distributive law)                         a · (b + c) = a · b + a · c.
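
As a sanity check on this list, the nine properties can be tested on a handful of specific numbers. The sketch below is not a proof of anything. It assumes that rational numbers, represented exactly by Python's `Fraction`, are an acceptable stand-in for "numbers"; the sample set and the function name `check_field_properties` are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# A small sample of rational numbers with denominators 1, 2, and 3.
samples = sorted({Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)})


def check_field_properties(numbers):
    """Assert P1-P9 for every choice of a, b, c from the given numbers."""
    zero, one = Fraction(0), Fraction(1)
    assert one != zero                                   # second half of P6
    for a in numbers:
        assert a + zero == zero + a == a                 # P2
        assert a + (-a) == (-a) + a == zero              # P3
        assert a * one == one * a == a                   # P6
        if a != zero:
            inverse = 1 / a
            assert a * inverse == inverse * a == one     # P7
    for a, b in product(numbers, repeat=2):
        assert a + b == b + a                            # P4
        assert a * b == b * a                            # P8
    for a, b, c in product(numbers, repeat=3):
        assert a + (b + c) == (a + b) + c                # P1
        assert a * (b * c) == (a * b) * c                # P5
        assert a * (b + c) == a * b + a * c              # P9
    return True


print(check_field_properties(samples))   # True
```

Exact fractions are used deliberately: ordinary floating-point arithmetic only approximates the numbers of the text, and rounding can make a check such as P1 fail for particular values.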


The three basic properties of numbers which remain to be listed are
concerned with inequalities. Although inequalities occur rarely in elementary
mathematics, they play a prominent role in calculus. The two notions of
inequality, a < b (a is less than b) and a > b (a is greater than b), are intimately
related: a < b means the same as b > a (thus 1 < 3 and 3 > 1 are merely two
ways of writing the same assertion). The numbers a satisfying a > 0 are
called positive, while those numbers a satisfying a < 0 are called negative.
While positivity can thus be defined in terms of <, it is possible to reverse
the procedure: a < b can be defined to mean that b - a is positive. In fact,
it is convenient to consider the collection of all positive numbers, denoted by
P, as the basic concept, and state all properties in terms of P:


(P10) (Trichotomy law) For every number a, one and only one of the
      following holds:

        (i) a = 0,
       (ii) a is in the collection P,
      (iii) -a is in the collection P.

(P11) (Closure under addition) If a and b are in P, then a + b is in P.

(P12) (Closure under multiplication) If a and b are in P, then a · b is
      in P.




These three properties should be complemented with the following defini- 
tions: 


     a > b   if   a - b is in P;
     a < b   if   b > a;
     a ≥ b   if   a > b or a = b;
     a ≤ b   if   a < b or a = b.*

Note, in particular, that a > 0 if and only if a is in P.
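
To see how much is carried by the single notion of the collection P, here is a minimal sketch in which every inequality symbol is built from one membership test. It assumes rational numbers represented by Python's `Fraction`; the function names (`in_P`, `greater`, `less`, `greater_equal`, `less_equal`, `trichotomy_count`) are invented for this illustration and are not notation from the text.

```python
from fractions import Fraction


def in_P(a):
    """Membership in the collection P of positive numbers.
    Assumption of this sketch: a is a Fraction, which always stores a
    positive denominator, so a is in P exactly when its numerator is > 0."""
    return a.numerator > 0


def greater(a, b):          # a > b   if   a - b is in P
    return in_P(a - b)


def less(a, b):             # a < b   if   b > a
    return greater(b, a)


def greater_equal(a, b):    # a >= b  if   a > b or a = b
    return greater(a, b) or a == b


def less_equal(a, b):       # a <= b  if   a < b or a = b
    return less(a, b) or a == b


def trichotomy_count(a):
    """How many of the three alternatives of P10 hold for a."""
    return [a == 0, in_P(a), in_P(-a)].count(True)


values = [Fraction(n, 4) for n in range(-8, 9)]
print(all(trichotomy_count(a) == 1 for a in values))       # True

# The two statements discussed in the footnote on the symbol for "at most":
print(less_equal(Fraction(1) + Fraction(1), Fraction(3)))   # True
print(less_equal(Fraction(1) + Fraction(1), Fraction(2)))   # True
```

Only `in_P` looks inside the numbers; the four comparison functions are literal transcriptions of the definitions just given.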

All the familiar facts about inequalities, however elementary they may
seem, are consequences of P10-P12. For example, if a and b are any two
numbers, then precisely one of the following holds:

       (i) a - b = 0,
      (ii) a - b is in the collection P,
     (iii) -(a - b) = b - a is in the collection P.

Using the definitions just made, it follows that precisely one of the following
holds:

       (i) a = b,
      (ii) a > b,
     (iii) b > a.


A slightly more interesting fact results from the following manipulations.
If a < b, so that b - a is in P, then surely (b + c) - (a + c) is in P; thus, if
a < b, then a + c < b + c. Similarly, suppose a < b and b < c. Then

          b - a is in P,
and       c - b is in P,
so        c - a = (c - b) + (b - a) is in P.

This shows that if a < b and b < c, then a < c. (The two inequalities a < b
and b < c are usually written in the abbreviated form a < b < c, which has
the third inequality a < c almost built in.)

The following assertion is somewhat less obvious: If a < 0 and b < 0, then
ab > 0. The only difficulty presented by the proof is the unraveling of
definitions. The symbol a < 0 means, by definition, 0 > a, which means 0 - a =
-a is in P. Similarly -b is in P, and consequently, by P12, (-a)(-b) = ab
is in P. Thus ab > 0.

The fact that ab > 0 if a> 0, 6> 0 and also if a < 0, ὁ < 0 has one 
special consequence: a? > 0 if @ # 0. Thus squares of nonze:.. numbers are 


* There is one slightly perplexing feature of the symbols > and <. The statements 


1-1<3 
1.12 


are both true, even though we know that < could be replaced by « in the first, and by = 
in the second. This sort of thing is bound to occur when < is used with specific numbers; the 
usefulness of the symbol is revealed by a statement like Theorem 1—here equality holds for 
some values of a and ὦ, while inequality holds for other values. 


always positive, and in particular we have proved a result which might have
seemed sufficiently elementary to be included in our list of properties: 1 > 0
(since $1 = 1^2$).

The fact that -a > 0 if a < 0 is the basis of a concept which will play an
extremely important role in this book. For any number a, we define the
absolute value |a| of a as follows:

$$|a| = \begin{cases} a, & a \ge 0 \\ -a, & a \le 0. \end{cases}$$

Note that |a| is always positive, except when a = 0. For example, we have
|-3| = 3, |7| = 7, $|1 + \sqrt{2} - \sqrt{3}| = 1 + \sqrt{2} - \sqrt{3}$, and
$|1 + \sqrt{2} - \sqrt{10}| = \sqrt{10} - \sqrt{2} - 1$. In general, the most straightforward approach to
any problem involving absolute values requires treating several cases separately,
since absolute values are defined by cases to begin with. This approach
may be used to prove the following very important fact about absolute values.

THEOREM 1

For all numbers a and b, we have

$$|a + b| \le |a| + |b|.$$

PROOF

We will consider 4 cases:

(1) a ≥ 0, b ≥ 0;
(2) a ≥ 0, b ≤ 0;
(3) a ≤ 0, b ≥ 0;
(4) a ≤ 0, b ≤ 0.


In case (1) we also have a + b ≥ 0, and the theorem is obvious; in fact,

|a + b| = a + b = |a| + |b|,

so that in this case equality holds.
In case (4) we have a + b ≤ 0, and again equality holds:

|a + b| = -(a + b) = -a + (-b) = |a| + |b|.

In case (2), when a ≥ 0 and b ≤ 0, we must prove that

|a + b| ≤ a - b.

This case may therefore be divided into two subcases. If a + b ≥ 0, then we
must prove that

a + b ≤ a - b,
i.e., b ≤ -b,

which is certainly true since b is negative and -b is positive. On the other
hand, if a + b ≤ 0, we must prove that

-a - b ≤ a - b,
i.e., -a ≤ a,

which is certainly true since a is positive and -a is negative.




Finally, note that case (3) may be disposed of with no additional work, by 
applying case (2) with a and b interchanged. ∎


Although this method of treating absolute values (separate consideration 
of various cases) is sometimes the only approach available, there are often 
simpler methods which may be used. In fact, it is possible to give a much 
shorter proof of Theorem 1; this proof is motivated by the observation that

$$|a| = \sqrt{a^2}.$$

(Here, and throughout the book, $\sqrt{x}$ denotes the positive square root of x; this
symbol is defined only when x ≥ 0.) We may now observe that

$$\begin{aligned}
(|a + b|)^2 = (a + b)^2 &= a^2 + 2ab + b^2 \\
&\le a^2 + 2|a| \cdot |b| + b^2 \\
&= |a|^2 + 2|a| \cdot |b| + |b|^2 \\
&= (|a| + |b|)^2.
\end{aligned}$$

From this we can conclude that |a + b| ≤ |a| + |b| because $x^2 < y^2$ implies
x < y, provided that x and y are both non-negative; a proof of this fact is left
to the reader (Problem 5).
One final observation may be made about the theorem we have just 
proved: a close examination of either proof offered shows that 
|a + b| = |a| + |b|

if a and b have the same sign (i.e., are both positive or both negative), or if
one of the two is 0, while

|a + b| < |a| + |b|

if a and b are of opposite signs.
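
The case analysis and the remark about equality can be spot-checked numerically. The following Python sketch is only an illustration, not a proof: the helper names abs_by_cases and triangle_gap and the small integer grid are assumptions made for this example.

```python
from itertools import product

def abs_by_cases(a):
    """Absolute value defined by cases: a if a >= 0, otherwise -a."""
    return a if a >= 0 else -a

def triangle_gap(a, b):
    """Return |a| + |b| - |a + b|; Theorem 1 says this is never negative."""
    return abs_by_cases(a) + abs_by_cases(b) - abs_by_cases(a + b)

# Small integer grid, chosen only for illustration.
samples = range(-5, 6)
gaps = {(a, b): triangle_gap(a, b) for a, b in product(samples, samples)}

# |a + b| <= |a| + |b| at every sample point.
inequality_holds = all(gap >= 0 for gap in gaps.values())

# Equality occurs exactly when a and b are not of opposite signs (a * b >= 0).
equality_matches_signs = all(
    (gap == 0) == (a * b >= 0) for (a, b), gap in gaps.items()
)

print(inequality_holds, equality_matches_signs)
```

A finite check of this kind can only fail to find a counterexample; the proofs above are what establish the theorem for all numbers.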

We will conclude this chapter with a subtle point, neglected until now, 
whose inclusion is required in a conscientious survey of the properties of 
numbers. After stating property P9, we proved that a - b = b - a implies
a = b. The proof began by establishing that

a · (1 + 1) = b · (1 + 1),

from which we concluded that a = b. This result is obtained from the equation
a · (1 + 1) = b · (1 + 1) by dividing both sides by 1 + 1. Division by 0
should be avoided scrupulously, and it must therefore be admitted that the
validity of the argument depends on knowing that 1 + 1 ≠ 0. Problem 24
is designed to convince you that this fact cannot possibly be proved from
properties P1-P9 alone! Once P10, P11, and P12 are available, however, the
proof is very simple: We have already seen that 1 > 0; it follows that 1 + 1 > 0,
and in particular 1 + 1 ≠ 0.

This last demonstration has perhaps only strengthened your feeling that it 
is absurd to bother proving such obvious facts, but an honest assessment of our 
present situation will help justify serious consideration of such details. In 




this chapter we have assumed that numbers are familiar objects, and that 
P1—P12 are merely explicit statements of obvious, well-known properties of 
numbers. It would be difficult, however, to justify this assumption. Although 
one learns how to “work with” numbers in school, just what numbers are
remains rather vague. A great deal of this book is devoted to elucidating the 
concept of numbers, and by the end of the book we will have become quite 
well acquainted with them. But it will be necessary to work with numbers 
throughout the book. It is therefore reasonable to admit frankly that we do 
not yet thoroughly understand numbers; we may still say that, in whatever 
way numbers are finally defined, they should certainly have properties 
P1-P12. 

Most of this chapter has been an attempt to present convincing evidence 
that P1—P12 are indeed basic properties which we should assume in order to 
deduce other familiar properties of numbers. Some of the problems (which 
indicate the derivation of other facts about numbers from P1—P12) are offered 
as further evidence. It is still a crucial question whether P1-—P12 actually 
account for all properties of numbers. As a matter of fact, we shall soon see 
that they do not. In the next chapter the deficiencies of properties P1-P12 will
become quite clear, but the proper means for correcting these deficiencies is
not so easily discovered. The crucial additional basic property of numbers
which we are seeking is profound and subtle, quite unlike P1—P12. The dis- 
covery of this crucial property will require all the work of Part II of this book. 
In the remainder of Part I we will begin to see why some additional property 
is required; in order to investigate this we will have to consider a little more 
carefully what we mean by “numbers.”


PROBLEMS 
1. Prove the following: 


(i) If ax = a for some number a ≠ 0, then x = 1.

(ii) $x^2 - y^2 = (x - y)(x + y)$.

(iii) If $x^2 = y^2$, then x = y or x = -y.

(iv) $x^3 - y^3 = (x - y)(x^2 + xy + y^2)$.

(v) $x^n - y^n = (x - y)(x^{n-1} + x^{n-2}y + \cdots + xy^{n-2} + y^{n-1})$.

(vi) $x^3 + y^3 = (x + y)(x^2 - xy + y^2)$. (There is a particularly easy
way to do this, using (iv), and it will show you how to find a factorization
for $x^n + y^n$ whenever n is odd.)


2. What is wrong with the following “proof”? Let x = y. Then

$$\begin{aligned}
x^2 &= xy, \\
x^2 - y^2 &= xy - y^2, \\
(x + y)(x - y) &= y(x - y), \\
x + y &= y, \\
2y &= y, \\
2 &= 1.
\end{aligned}$$


3. Prove the following:


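(i) $\dfrac{a}{b} = \dfrac{ac}{bc}$, if b, c ≠ 0.

(ii) $\dfrac{a}{b} + \dfrac{c}{d} = \dfrac{ad + bc}{bd}$, if b, d ≠ 0.

(iii) $(ab)^{-1} = a^{-1} b^{-1}$, if a, b ≠ 0. (To do this you must remember the
defining property of $(ab)^{-1}$.)

(iv) $\dfrac{a}{b} \cdot \dfrac{c}{d} = \dfrac{ac}{db}$, if b, d ≠ 0.

(v) $\dfrac{a}{b} \Big/ \dfrac{c}{d} = \dfrac{ad}{bc}$, if b, c, d ≠ 0.

(vi) If b, d ≠ 0, then $\dfrac{a}{b} = \dfrac{c}{d}$ if and only if ad = bc. Also determine when
$\dfrac{a}{b} = \dfrac{b}{a}$.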


4. Find all numbers x for which

(i) $4 - x < 3 - 2x$.

(ii) $5 - x^2 < 8$.

(iii) $5 - x^2 < -2$.

(iv) $(x - 1)(x - 3) > 0$. (When is a product of two numbers positive?)

(v) $x^2 - 2x + 2 > 0$.

(vi) $x^2 + x + 1 > 2$.

(vii) $x^2 - x + 10 > 16$.

(viii) $x^2 + x + 1 > 0$.

(ix) $(x - \pi)(x + 5)(x - 3) > 0$.

(x) $(x - \sqrt[3]{2})(x - \sqrt{2}) > 0$.

(xi) $2^x < 8$.

(xii) $x + 3^x < 4$.

(xiii) $\dfrac{1}{x} + \dfrac{1}{1 - x} > 0$.

(xiv) $\dfrac{x - 1}{x + 1} > 0$.


5. Prove the following:

(i) If a < b and c < d, then a + c < b + d.

(ii) If a < b, then -b < -a.

(iii) If a < b and c > d, then a - c < b - d.

(iv) If a < b and c > 0, then ac < bc.

(v) If a < b and c < 0, then ac > bc.

(vi) If a > 1, then $a^2 > a$.

(vii) If 0 < a < 1, then $a^2 < a$.

(viii) If 0 ≤ a < b and 0 ≤ c < d, then ac < bd.

(ix) If 0 ≤ a < b, then $a^2 < b^2$. (Use (viii).)

(x) If a, b ≥ 0 and $a^2 < b^2$, then a < b. (Use (ix), backwards.)


6. Prove that if 0 < a < b, then

$$a < \sqrt{ab} < \frac{a + b}{2} < b.$$

Notice that the inequality $\sqrt{ab} \le (a + b)/2$ holds for a, b ≥ 0, without
the additional assumption a < b. A generalization of this fact occurs in
Problem 2-20.

*7. (a) Prove that if $x^n = y^n$ and n is odd, then x = y. Hint: First explain
why it suffices to consider only the case x, y > 0; then show that
x < y and y < x are both impossible.

(b) Prove that if $x^n = y^n$ and n is even, then x = y or x = -y.

*8. Although the basic properties of inequalities were stated in terms of the
collection P of all positive numbers, and < was defined in terms of P,
this procedure can be reversed. Suppose that P10-P12 are replaced by

(P′10) For any numbers a and b one, and only one, of the following holds:
(i) a = b,
(ii) a < b,
(iii) b < a.
(P′11) For any numbers a, b, and c, if a < b and b < c, then a < c.
(P′12) For any numbers a, b, and c, if a < b, then a + c < b + c.
(P′13) For any numbers a, b, and c, if a < b and 0 < c, then ac < bc.


Show that P10-P12 can then be deduced as theorems. 
9. Express each of the following with at least one less pair of absolute value
signs.

(i) $|\sqrt{2} + \sqrt{3} - \sqrt{5} + \sqrt{7}|$.

(ii) $|(|a + b| - |a| - |b|)|$.

(iii) $|(|a + b| + |c| - |a + b + c|)|$.

(iv) $|x^2 - 2xy + y^2|$.

(v) $|(|\sqrt{2} + \sqrt{3}| - |\sqrt{5} - \sqrt{7}|)|$.


10. Express each of the following without absolute value signs, treating
various cases separately when necessary.


(i) |a + b| - |b|.
(ii) |(|x| - 1)|.
(iii) $|x| - |x^2|$.
(iv) a - |(a - |a|)|.


11. Find all numbers x for which

(i) |x - 3| = 8.
(ii) |x - 3| < 8.
(iii) |x + 4| < 2.
(iv) |x - 1| + |x - 2| > 1.
(v) |x - 1| + |x + 1| < 2.
(vi) |x - 1| + |x + 1| < 1.
(vii) |x - 1| · |x + 1| = 0.
(viii) |x - 1| · |x + 2| = 3.


12. Prove the following:

(i) |xy| = |x| · |y|.

(ii) $\left|\dfrac{1}{x}\right| = \dfrac{1}{|x|}$, if x ≠ 0. (The best way to do this is to remember what
$|x|^{-1}$ is.)

(iii) $\dfrac{|x|}{|y|} = \left|\dfrac{x}{y}\right|$, if y ≠ 0.

(iv) |x - y| ≤ |x| + |y|. (Give a very short proof.)

(v) |x| - |y| ≤ |x - y|. (A very short proof is possible, if you write
things in the right way.)

(vi) |(|x| - |y|)| ≤ |x - y|. (Why does this follow immediately from
(v)?)

(vii) |x + y + z| ≤ |x| + |y| + |z|. Indicate when equality holds, and
prove your statement.


13. The maximum of two numbers x and y is denoted by max(x, y). Thus
max(-1, 3) = max(3, 3) = 3 and max(-1, -4) = max(-4, -1)
= -1. The minimum of x and y is denoted by min(x, y). Prove that

$$\max(x, y) = \frac{x + y + |y - x|}{2},$$
$$\min(x, y) = \frac{x + y - |y - x|}{2}.$$

Derive a formula for max(x, y, z) and min(x, y, z), using, for example,

max(x, y, z) = max(x, max(y, z)).
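
Before attempting the proof, it can be reassuring to see the two formulas agree with the usual maximum and minimum on sample values. The sketch below assumes the formulas in the form max(x, y) = (x + y + |y - x|)/2 and min(x, y) = (x + y - |y - x|)/2; the function names and the sample grid are illustrative choices, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from itertools import product

def max_formula(x, y):
    """(x + y + |y - x|) / 2, the claimed formula for max(x, y)."""
    return (x + y + abs(y - x)) / 2

def min_formula(x, y):
    """(x + y - |y - x|) / 2, the claimed formula for min(x, y)."""
    return (x + y - abs(y - x)) / 2

def max3_formula(x, y, z):
    """max(x, y, z) computed as max(x, max(y, z)), as the hint suggests."""
    return max_formula(x, max_formula(y, z))

# Half-integers between -3 and 3, as exact fractions.
values = [Fraction(n, 2) for n in range(-6, 7)]

formulas_agree = all(
    max_formula(x, y) == max(x, y) and min_formula(x, y) == min(x, y)
    for x, y in product(values, values)
)
print(formulas_agree)
```

Agreement on a finite grid is evidence, not a proof; the proof still has to treat the cases x ≤ y and y ≤ x.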




14. (a) Prove that |a| = |-a|. (The trick is not to become confused by too
many cases. First prove the statement for a ≥ 0. Why is it then
obvious for a ≤ 0?)
(b) Prove that -b ≤ a ≤ b if and only if |a| ≤ b. In particular, it
follows that -|a| ≤ a ≤ |a|.
(c) Use this fact to give a new proof that |a + b| ≤ |a| + |b|.
*15. (a) Use Problem 1 and Problem 7 to prove that if x and y are not both
0, then

$$x^2 + xy + y^2 \ne 0,$$
$$x^4 + x^3 y + x^2 y^2 + x y^3 + y^4 \ne 0.$$

For every number x ≠ 0, each of these expressions is positive for some
positive number y and also for some negative y (namely, y = ±x); it
therefore seems reasonable that the ≠ signs can be replaced by >
signs. This maneuver is valid, but we are not yet in a position to prove
this (see Problem 7-9). Parts (b) and (d) of this problem provide a
direct demonstration that the > signs hold.


(b) Using the fact that
$$x^2 + 2xy + y^2 = (x + y)^2 \ge 0,$$
show that the assumption $x^2 + xy + y^2 < 0$ leads to a contradiction.
(c) Show similarly that if x and y are not both 0, then
$$4x^2 + 6xy + 4y^2 > 0,$$
SX Poe oy 0; 


**(d) Show that if x and y are not both 0, then

$$x^4 + x^3 y + x^2 y^2 + x y^3 + y^4 > 0.$$
*16. (a) Show that

$$(x + y)^2 = x^2 + y^2 \text{ only when } x = 0 \text{ or } y = 0,$$
$$(x + y)^3 = x^3 + y^3 \text{ only when } x = 0 \text{ or } y = 0 \text{ or } x = -y.$$

(b) Use Problem 15 to find out when $(x + y)^4 = x^4 + y^4$.

**(c) Find out when $(x + y)^5 = x^5 + y^5$. Hint: From the assumption
$(x + y)^5 = x^5 + y^5$ you should be able to derive the equation
$x^3 + 2x^2 y + 2x y^2 + y^3 = 0$, if xy ≠ 0. This implies that
$(x + y)^3 = x^2 y + x y^2 = xy(x + y)$.

You should now be able to make a good guess as to when $(x + y)^n = x^n + y^n$;
the proof is contained in Problem 11-41.
17. (a) Suppose that $b^2 - 4c \ge 0$. Show that the numbers

$$\frac{-b + \sqrt{b^2 - 4c}}{2}, \qquad \frac{-b - \sqrt{b^2 - 4c}}{2}$$

both satisfy the equation $x^2 + bx + c = 0$.

(b) Suppose that $b^2 - 4c < 0$. Show that there are no numbers x
satisfying $x^2 + bx + c = 0$; in fact, $x^2 + bx + c > 0$ for all x. Hint:
“Complete the square,” i.e., write $x^2 + bx + c = (x + b/2)^2 + {?}$

(c) Use this fact to give another proof that if x and y are not both 0,
then $x^2 + xy + y^2 > 0$.

(d) For which numbers α is it true that $x^2 + \alpha xy + y^2 > 0$ whenever
x and y are not both 0?

(e) Find the smallest possible value of $x^2 + bx + c$ and of $ax^2 + bx + c$,
for a > 0. (Use the trick in part (b).)

18. The fact that $a^2 \ge 0$ for all numbers a, elementary as it may seem, is
nevertheless the fundamental idea upon which most important inequalities
are ultimately based. The great-granddaddy of all inequalities
is the Schwarz inequality:

$$x_1 y_1 + x_2 y_2 \le \sqrt{x_1^2 + x_2^2}\,\sqrt{y_1^2 + y_2^2}.$$

The three proofs of the Schwarz inequality outlined below have only
one thing in common: their reliance on the fact that $a^2 \ge 0$ for all a.

(a) Prove the Schwarz inequality by first proving that

$$(x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1 y_1 + x_2 y_2)^2 + (x_1 y_2 - x_2 y_1)^2.$$

(b) Prove that if $x_1 = \lambda y_1$ and $x_2 = \lambda y_2$ for some number λ, then
equality holds in the Schwarz inequality. Prove the same thing if
$y_1 = y_2 = 0$. Now suppose that $y_1$ and $y_2$ are not both 0, and that
there is no number λ such that $x_1 = \lambda y_1$ and $x_2 = \lambda y_2$. Then

$$\begin{aligned}
0 &< (\lambda y_1 - x_1)^2 + (\lambda y_2 - x_2)^2 \\
&= \lambda^2 (y_1^2 + y_2^2) - 2\lambda (x_1 y_1 + x_2 y_2) + (x_1^2 + x_2^2).
\end{aligned}$$

Using Problem 17, complete the proof of the Schwarz inequality.

(c) Prove the Schwarz inequality by using $2xy \le x^2 + y^2$ (how is this
derived?) with

$$x = \frac{x_i}{\sqrt{x_1^2 + x_2^2}}, \qquad y = \frac{y_i}{\sqrt{y_1^2 + y_2^2}},$$

first for i = 1 and then for i = 2.

(d) Deduce, from each of these three proofs, that equality holds only
when $y_1 = y_2 = 0$ or when there is a number λ such that $x_1 = \lambda y_1$
and $x_2 = \lambda y_2$.
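
The identity in part (a) and the inequality itself can be spot-checked on integers, where the arithmetic is exact. This Python sketch is only an illustration under stated assumptions: the function names and the integer grid are hypothetical, and the inequality is tested in squared form (using the fact from Problem 5 that comparing squares of non-negative numbers is enough) to avoid square roots.

```python
from itertools import product

def identity_holds(x1, x2, y1, y2):
    """Check (x1^2 + x2^2)(y1^2 + y2^2) = (x1 y1 + x2 y2)^2 + (x1 y2 - x2 y1)^2."""
    left = (x1**2 + x2**2) * (y1**2 + y2**2)
    right = (x1 * y1 + x2 * y2) ** 2 + (x1 * y2 - x2 * y1) ** 2
    return left == right

def schwarz_holds(x1, x2, y1, y2):
    """Check x1 y1 + x2 y2 <= sqrt(x1^2 + x2^2) * sqrt(y1^2 + y2^2) without roots."""
    left = x1 * y1 + x2 * y2
    if left <= 0:
        return True  # the right side is a product of square roots, hence >= 0
    return left**2 <= (x1**2 + x2**2) * (y1**2 + y2**2)

# All integer quadruples with entries from -3 to 3.
grid = list(product(range(-3, 4), repeat=4))
identity_ok = all(identity_holds(*q) for q in grid)
schwarz_ok = all(schwarz_holds(*q) for q in grid)
print(identity_ok, schwarz_ok)
```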


In our later work, three facts about inequalities will be crucial. Although
proofs will be supplied at the appropriate point in the text, a personal assault
on these problems is infinitely more enlightening than a perusal of a completely
worked-out proof. The statements of these propositions involve some
weird numbers, but their basic message is very simple: if x is close enough to $x_0$,
and y is close enough to $y_0$, then x + y will be close to $x_0 + y_0$, and xy will be
close to $x_0 y_0$, and 1/y will be close to $1/y_0$. The symbol “ε” which appears in
these propositions is the fifth letter of the Greek alphabet (“epsilon”), and
could just as well be replaced by a less intimidating Roman letter; however,
tradition has made the use of ε almost sacrosanct in the contexts to which
these theorems apply.


19. Prove that if

$$|x - x_0| < \frac{\varepsilon}{2} \quad\text{and}\quad |y - y_0| < \frac{\varepsilon}{2},$$

then

$$|(x + y) - (x_0 + y_0)| < \varepsilon,$$
$$|(x - y) - (x_0 - y_0)| < \varepsilon.$$

*20. Prove that if

$$|x - x_0| < \min\left(\frac{\varepsilon}{2(|y_0| + 1)},\, 1\right) \quad\text{and}\quad |y - y_0| < \frac{\varepsilon}{2(|x_0| + 1)},$$

then $|xy - x_0 y_0| < \varepsilon$.
(The notation “min” was defined in Problem 13, but the formula provided
by that problem is irrelevant at the moment; the first inequality
in the hypothesis just means that

$$|x - x_0| < \frac{\varepsilon}{2(|y_0| + 1)} \quad\text{and}\quad |x - x_0| < 1;$$

at one point in the argument you will need the first inequality, and at
another point you will need the second. One more word of advice: since
the hypotheses only provide information about $x - x_0$ and $y - y_0$, it is
almost a foregone conclusion that the proof will depend upon writing
$xy - x_0 y_0$ in a way that involves $x - x_0$ and $y - y_0$.)

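*21. Prove that if $y_0 \ne 0$ and

$$|y - y_0| < \min\left(\frac{|y_0|}{2},\, \frac{\varepsilon |y_0|^2}{2}\right),$$

then y ≠ 0 and

$$\left|\frac{1}{y} - \frac{1}{y_0}\right| < \varepsilon.$$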

*22. Replace the question marks in the following statement by expressions
involving ε, $x_0$, and $y_0$ so that the conclusion will be true:

If $y_0 \ne 0$ and

$$|y - y_0| < {?} \quad\text{and}\quad |x - x_0| < {?}$$

then y ≠ 0 and

$$\left|\frac{x}{y} - \frac{x_0}{y_0}\right| < \varepsilon.$$

This problem is trivial in the sense that its solution follows from Problems
20 and 21 with almost no work at all (notice that x/y = x · 1/y). The
crucial point is not to become confused; decide which of the two
problems should be used first, and don't panic if your answer looks
unlikely.

23. This problem shows that the actual placement of parentheses in a sum
is irrelevant. The proofs involve “mathematical induction”; if you are
not familiar with such proofs, but still want to tackle this problem, it can
be saved until after Chapter 2, where proofs by induction are explained.
Let us agree, for definiteness, that $a_1 + \cdots + a_n$ will denote

$$a_1 + (a_2 + (a_3 + \cdots + (a_{n-2} + (a_{n-1} + a_n))) \cdots ).$$

Thus $a_1 + a_2 + a_3$ denotes $a_1 + (a_2 + a_3)$, and $a_1 + a_2 + a_3 + a_4$
denotes $a_1 + (a_2 + (a_3 + a_4))$, etc.


(a) Prove that

$$(a_1 + \cdots + a_k) + a_{k+1} = a_1 + \cdots + a_{k+1}.$$

Hint: Use induction on k.

(b) Prove that if n ≥ k, then

$$(a_1 + \cdots + a_k) + (a_{k+1} + \cdots + a_n) = a_1 + \cdots + a_n.$$

Hint: Use part (a) to give a proof by induction on k.

(c) Let $s(a_1, \ldots, a_k)$ be some sum formed from $a_1, \ldots, a_k$. Show
that

$$s(a_1, \ldots, a_k) = a_1 + \cdots + a_k.$$

Hint: There must be two sums $s'(a_1, \ldots, a_l)$ and $s''(a_{l+1}, \ldots, a_k)$
such that

$$s(a_1, \ldots, a_k) = s'(a_1, \ldots, a_l) + s''(a_{l+1}, \ldots, a_k).$$

24. Suppose that we interpret “number” to mean either 0 or 1, and +
and · to be the operations defined by the following two tables.

 + | 0  1         · | 0  1
---+------       ---+------
 0 | 0  1         0 | 0  0
 1 | 1  0         1 | 0  1

Check that properties P1-P9 all hold, even though 1 + 1 = 0.
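
Because this system has only two “numbers,” every property can be checked by trying all cases. The Python sketch below does that by brute force. It assumes the usual statements of P1-P9 from Chapter 1 (associativity, identities, inverses, and commutativity for + and ·, plus the distributive law); the labels in the dictionary and the function names are assumptions of this illustration, and the two tables are encoded as arithmetic modulo 2.

```python
from itertools import product

NUMBERS = (0, 1)

def add(a, b):
    """The table for +: 0+0=0, 0+1=1, 1+0=1, 1+1=0."""
    return (a + b) % 2

def mul(a, b):
    """The table for ·: only 1·1 equals 1."""
    return (a * b) % 2

def check_properties():
    pairs = list(product(NUMBERS, repeat=2))
    triples = list(product(NUMBERS, repeat=3))
    return {
        "P1 (a+(b+c) = (a+b)+c)": all(add(a, add(b, c)) == add(add(a, b), c) for a, b, c in triples),
        "P2 (a+0 = 0+a = a)": all(add(a, 0) == a == add(0, a) for a in NUMBERS),
        "P3 (additive inverse)": all(any(add(a, b) == 0 == add(b, a) for b in NUMBERS) for a in NUMBERS),
        "P4 (a+b = b+a)": all(add(a, b) == add(b, a) for a, b in pairs),
        "P5 (a·(b·c) = (a·b)·c)": all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples),
        "P6 (a·1 = 1·a = a, 1 ≠ 0)": all(mul(a, 1) == a == mul(1, a) for a in NUMBERS),
        "P7 (inverse for a ≠ 0)": all(any(mul(a, b) == 1 == mul(b, a) for b in NUMBERS) for a in NUMBERS if a != 0),
        "P8 (a·b = b·a)": all(mul(a, b) == mul(b, a) for a, b in pairs),
        "P9 (a·(b+c) = a·b + a·c)": all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples),
    }

results = check_properties()
for name, ok in results.items():
    print(name, ok)
print("1 + 1 =", add(1, 1))
```

Since P1-P9 hold here while 1 + 1 = 0, no argument using P1-P9 alone can show that 1 + 1 ≠ 0.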


CHAPTER 2


NUMBERS OF VARIOUS SORTS 


In Chapter 1 we used the word “number” very loosely, despite our concern 
with the basic properties of numbers. It will now be necessary to distinguish 
carefully various kinds of numbers. 

The simplest numbers are the “counting numbers”

1, 2, 3, ....

The fundamental significance of this collection of numbers is emphasized by
its symbol N (for natural numbers). A brief glance at P1-P12 will show that
our basic properties of “numbers” do not apply to N; for example, P2 and
P3 do not make sense for N. From this point of view the system N has many
deficiencies. Nevertheless, N is sufficiently important to deserve several com- 
ments before we consider larger collections of numbers. 

The most basic property of N is the principle of “mathematical induction.”
Suppose P(x) means that the property P holds for the number x. Then the
principle of mathematical induction states that P(x) is true for all natural
numbers x provided that

(1) P(1) is true.
(2) Whenever P(k) is true, P(k + 1) is true.

Note that condition (2) merely asserts the truth of P(k + 1) under the
assumption that P(k) is true; this suffices to ensure the truth of P(x) for all x,
if condition (1) also holds. In fact, if P(1) is true, then it follows that P(2) is
true (by using (2) in the special case k = 1). Now, since P(2) is true it follows
that P(3) is true (using (2) in the special case k = 2). It is clear that each
number will eventually be reached by a series of steps of this sort, so that
P(k) is true for all numbers k.

A favorite illustration of the reasoning behind mathematical induction 
envisions an infinite line of people,


person number 1, person number 2, person number 3, . . . 


If each person has been instructed to tell any secret he hears to the person
behind him (the one with the next largest number) and a secret is told to
person number 1, then clearly every person will eventually learn the secret.
If P(x) is the assertion that person number x will learn the secret, then the
instructions given (to tell all secrets learned to the next person) assure that
condition (2) is true, and telling the secret to person number 1 makes (1) true.
The following example is a less facetious use of mathematical induction.
There is a useful and striking formula which expresses the sum of the first n
numbers in a simple way:

$$1 + \cdots + n = \frac{n(n + 1)}{2}.$$


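To prove this formula, note first that it is clearly true for n = 1. Now assume
that for some integer k we have

$$1 + \cdots + k = \frac{k(k + 1)}{2}.$$

Then

$$\begin{aligned}
1 + \cdots + k + (k + 1) &= \frac{k(k + 1)}{2} + k + 1 \\
&= \frac{k(k + 1) + 2k + 2}{2} \\
&= \frac{k^2 + 3k + 2}{2} \\
&= \frac{(k + 1)(k + 2)}{2},
\end{aligned}$$

so the formula is also true for k + 1. By the principle of induction this proves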
the formula for all natural numbers n. This particular example illustrates a 
phenomenon that frequently occurs, especially in connection with formulas 
like the one just proved. Although the proof by induction is often quite 
straightforward, the method by which the formula was discovered remains a 
mystery. Problems 4 and 5 indicate how some formulas of this type may be 
derived. 
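
The two parts of the induction argument, the case n = 1 and the step from k to k + 1, can be mirrored in a few lines of Python. This is a sanity check on finitely many values under illustrative assumptions (the function names and the range of k are arbitrary choices), not a replacement for the proof.

```python
def sum_first(n):
    """1 + 2 + ... + n, computed by direct addition."""
    return sum(range(1, n + 1))

def closed_form(n):
    """n(n + 1)/2; the product n(n + 1) is always even, so // is exact."""
    return n * (n + 1) // 2

# Condition (1): the formula is true for n = 1.
base_case = sum_first(1) == closed_form(1)

# Condition (2), tested for k = 1, ..., 999: adding k + 1 to the formula
# for k gives the formula for k + 1.
inductive_step = all(
    closed_form(k) + (k + 1) == closed_form(k + 1) for k in range(1, 1000)
)

print(base_case, inductive_step, closed_form(100))
```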

The principle of mathematical induction may be formulated in an equiv- 
alent way without speaking of “properties” of a number, a term which is 
sufficiently vague to be eschewed in a mathematical discussion. A more precise 
formulation states that if A is any collection (or “set,” a synonymous mathematical
term) of natural numbers and

(1) 1 is in A,
(2) k + 1 is in A whenever k is in A,


then A is the set of all natural numbers. It should be clear that this formulation 
adequately replaces the less formal one given previously—we just consider the 
set A of natural numbers x which satisfy P(x). For example, suppose A is the
set of natural numbers n for which it is true that

$$1 + \cdots + n = \frac{n(n + 1)}{2}.$$

Our previous proof of this formula showed that A contains 1, and that k + 1
is in A, if k is. It follows that A is the set of all natural numbers, i.e., that the
formula holds for all natural numbers n.

There is yet another rigorous formulation of the principle of mathematical 
induction, which looks quite different. If A is any collection of natural numbers,
it is tempting to say that A must have a smallest member. Actually, this
statement can fail to be true in a rather subtle way. A particularly important
set of natural numbers is the collection A that contains no natural numbers at
all, the “empty collection” or “null set,”* denoted by ∅. The null set ∅ is a
collection of natural numbers that has no smallest member; in fact, it has no
members at all. This is the only possible exception, however; if A is a nonnull
set of natural numbers, then A has a least member. This “intuitively obvious”
statement, known as the “well-ordering principle,” can be proved from the
principle of induction as follows. Suppose that the set A has no least member.
Let B be the set of natural numbers n such that 1, ..., n are all not in A.
Clearly 1 is in B (because if 1 were in A, then A would have 1 as smallest
member). Moreover, if 1, ..., k are not in A, surely k + 1 is not in A
(otherwise k + 1 would be the smallest member of A), so 1, ..., k + 1 are
all not in A. This shows that if k is in B, then k + 1 is in B. It follows that
every number n is in B, i.e., the numbers 1, ..., n are not in A for any
natural number n. Thus A = ∅, which completes the proof.

It is also possible to prove the principle of induction from the well-ordering 
principle (Problem 9). Either principle may be considered as a basic assump- 
tion about the natural numbers. 

There is still another form of induction which should be mentioned. It
sometimes happens that in order to prove P(k + 1) we must assume not only
P(k), but also P(l) for all natural numbers l ≤ k. In this case we rely on the
“principle of complete induction”: If A is a set of natural numbers and

(1) 1 is in A,
(2) k + 1 is in A if 1, ..., k are in A,

then A is the set of all natural numbers.

Although the principle of complete induction may appear much stronger 
than the ordinary principle of induction, it is actually a consequence of that 
principle. The proof of this fact is left to the reader, with a hint (Problem 10). 
Applications will be found in Problems 6, 16, 19, and 20. 

Closely related to proofs by induction are “recursive definitions.” For
example, the number n! (read “n factorial”) is defined as the product of all the
natural numbers less than or equal to n:

$$n! = 1 \cdot 2 \cdots (n - 1) \cdot n.$$

This can be expressed more precisely as follows:

(1) 1! = 1,
(2) n! = n · (n - 1)!.


This form of the definition exhibits the relationship between n! and (n — 1)! 


* Although it may not strike you as a collection, in the ordinary sense of the word, the null set
arises quite naturally in many contexts. We frequently consider the set A, consisting of all x
satisfying some property P; often we have no guarantee that P is satisfied by any number, so
that A might be ∅; in fact often one proves that P is always false by showing that A = ∅.




in an explicit way that is ideally suited for proofs by induction. Problem 21 
reviews a definition already familiar to you, which may be expressed more 
succinctly as a recursive definition; as this problem shows, the recursive 
definition is really necessary for a rigorous proof of some of the basic properties 
of the definition.
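
The recursive definition of n! translates almost word for word into a recursive function. The Python sketch below is a minimal illustration; the function name and the error raised for inputs that are not natural numbers are choices made for this example.

```python
def factorial(n):
    """n! for a natural number n, following the two clauses of the definition."""
    if n < 1:
        raise ValueError("n must be a natural number (1, 2, 3, ...)")
    if n == 1:
        return 1                 # clause (1): 1! = 1
    return n * factorial(n - 1)  # clause (2): n! = n * (n - 1)!

print([factorial(n) for n in range(1, 8)])
```

Each call reduces n by 1 until clause (1) applies, which is the same pattern of reasoning as an induction on n.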

One definition which may not be familiar involves some convenient nota- 
tion which we will constantly be using. Instead of writing 


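$$a_1 + \cdots + a_n,$$

we will usually employ the Greek letter Σ (capital sigma, for “sum”) and
write

$$\sum_{i=1}^{n} a_i.$$

In other words, $\sum_{i=1}^{n} a_i$ denotes the sum of the numbers obtained by letting
i = 1, 2, ..., n. Thus

$$\sum_{i=1}^{n} i = 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}.$$

Notice that the letter i really has nothing to do with the number denoted by
$\sum_{i=1}^{n} i$, and can be replaced by any convenient symbol (except n, of course!):

$$\sum_{j=1}^{n} j = \frac{n(n + 1)}{2}.$$

To define $\sum_{i=1}^{n} a_i$ precisely really requires a recursive definition:

(1) $\sum_{i=1}^{1} a_i = a_1$,
(2) $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n-1} a_i + a_n$.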

But only purveyors of mathematical austerity would insist too strongly on 
such precision. In practice, all sorts of modifications of this symbolism are 
used, and no one ever considers it necessary to add any words of explanation. 


The symbol

$$\sum_{\substack{i=1 \\ i \ne 4}}^{n} a_i,$$

for example, is an obvious way of writing

$$a_1 + a_2 + a_3 + a_5 + a_6 + \cdots + a_n,$$

or more precisely,

$$\sum_{i=1}^{3} a_i + \sum_{i=5}^{n} a_i.$$
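
The Σ notation and its informal variants can be mirrored in code. In the Python sketch below, sigma follows a recursive definition of the sum from i = 1 to n, and sigma_skipping imitates the modified symbol in which one index is left out; both function names and the sample term a_i = i are assumptions made for this illustration.

```python
def sigma(a, n):
    """Sum of a(1), ..., a(n), defined recursively on n."""
    if n < 1:
        raise ValueError("n must be a natural number")
    if n == 1:
        return a(1)                  # sum up to 1 is just a_1
    return sigma(a, n - 1) + a(n)    # sum up to n = (sum up to n - 1) + a_n

def sigma_skipping(a, n, skipped):
    """Sum of a(i) for i = 1, ..., n with the single index `skipped` omitted."""
    return sum(a(i) for i in range(1, n + 1) if i != skipped)

def term(i):
    """Sample term a_i = i, so that sigma(term, n) is 1 + 2 + ... + n."""
    return i

n = 10
whole = sigma(term, n)
without_fourth = sigma_skipping(term, n, 4)
# "More precisely": the first three terms plus the terms from 5 to n.
split_form = sigma(term, 3) + sum(term(i) for i in range(5, n + 1))
print(whole, without_fourth, split_form)
```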
The deficiencies of the natural numbers which we discovered at the begin- 
ning of this chapter may be partially remedied by extending this system to the 
set of integers 
..., -2, -1, 0, 1, 2, ....


This set is denoted by Z (from German “Zahl,” number). Of properties
P1—P12, only P7 fails for Z. 

A still larger system of numbers is obtained by taking quotients m/n of 
integers (with n ≠ 0). These numbers are called rational numbers, and the
set of all rational numbers is denoted by Q (for “quotients”). In this system of
numbers all of P1-P12 are true. It is tempting to conclude that the “properties
of numbers,” which we studied in some detail in Chapter 1, refer to just one
set of numbers, namely, Q. There is, however, a still larger collection of 
numbers to which properties P1-P12 apply—the set of all real numbers, 
denoted by R. The real numbers include not only the rational numbers, but 
other numbers as well (the irrational numbers) which can be represented 
by infinite decimals; π and $\sqrt{2}$ are both examples of irrational numbers. The
proof that π is irrational is not easy; we shall devote all of Chapter 16 of
Part III to a proof of this fact. The irrationality of $\sqrt{2}$, on the other hand, is
quite simple, and was known to the Greeks. (Since the Pythagorean theorem
shows that an isosceles right triangle, with sides of length 1, has a hypotenuse
of length $\sqrt{2}$, it is not surprising that the Greeks should have investigated this
question.) The proof depends on a few observations about the natural numbers.
Every natural number n can be written either in the form 2k for some
integer k, or else in the form 2k + 1 for some integer k (this “obvious” fact has
a simple proof by induction (Problem 7)). Those natural numbers of the form
2k are called even; those of the form 2k + 1 are called odd. Note that even
numbers have even squares, and odd numbers have odd squares:

$$(2k)^2 = 4k^2 = 2 \cdot (2k^2),$$
$$(2k + 1)^2 = 4k^2 + 4k + 1 = 2 \cdot (2k^2 + 2k) + 1.$$

In particular it follows that the converse must also hold: if $n^2$ is even, then n is
even; if $n^2$ is odd, then n is odd. The proof that $\sqrt{2}$ is irrational is now quite




simple. Suppose that $\sqrt{2}$ were rational; that is, suppose there were natural
numbers p and q such that

$$\left(\frac{p}{q}\right)^2 = 2.$$

We can assume that p and q have no common divisor (since all common
divisors could be divided out to begin with). Now we have

$$p^2 = 2q^2.$$

This shows that $p^2$ is even, and consequently p must be even; that is, p = 2k
for some natural number k. Then

$$p^2 = 4k^2 = 2q^2,$$

so

$$2k^2 = q^2.$$

This shows that $q^2$ is even, and consequently that q is even. Thus both p and q
are even, contradicting the fact that p and q have no common divisor. This
contradiction completes the proof.
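
Two ingredients of this argument lend themselves to a quick computational check: that n and its square always have the same parity, and that no small natural numbers p and q satisfy p² = 2q². The Python sketch below is an illustration only; the search bounds are arbitrary, and an empty search result is of course no substitute for the proof.

```python
def is_even(n):
    """True when n has the form 2k."""
    return n % 2 == 0

# n and n*n are even together and odd together.
squares_preserve_parity = all(is_even(n * n) == is_even(n) for n in range(1, 1000))

# Brute-force search for natural numbers p, q with p*p == 2*q*q.
# Any solution would need q < p < 2q, so p is only tried up to 2q.
solutions = [
    (p, q)
    for q in range(1, 200)
    for p in range(1, 2 * q + 1)
    if p * p == 2 * q * q
]

print(squares_preserve_parity, solutions)
```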

It is important to understand precisely what this proof shows. We have 
demonstrated that there is no rational number x such that $x^2 = 2$. This assertion
is often expressed more briefly by saying that $\sqrt{2}$ is irrational. Note,
however, that the use of the symbol $\sqrt{2}$ implies the existence of some number
(necessarily irrational) whose square is 2. We have not proved that such a
number exists and we can assert confidently that, at present, a proof is
impossible for us. Any proof at this stage would have to be based on P1-P12
(the only properties of R we have mentioned); since P1-P12 are also true for
Q, the exact same argument would show that there is a rational number whose
square is 2, and this we know is false. (Note that the reverse argument will
not work: our proof that there is no rational number whose square is 2 cannot
be used to show that there is no real number whose square is 2, because our
proof used not only P1-P12 but also a special property of Q, the fact that every
number in Q can be written p/q for integers p and q.)

This particular deficiency in our list of properties of the real numbers could, 
of course, be corrected by adding a new property which asserts the existence 
of square roots of positive numbers. Resorting to such a measure is, however, 
neither aesthetically pleasing nor mathematically satisfactory; we would still 
not know that every number has an nth root if n is odd, and that every positive 
number has an nth root if n is even. Even if we assumed this, we could not 
prove the existence of a number x satisfying $x^5 + x + 1 = 0$ (even though
there does happen to be one), since we do not know how to write the solution
of the equation in terms of nth roots (in fact, it is known that the solution
cannot be written in this form). And, of course, we certainly do not wish to
assume that all equations have solutions, since this is false (no real number x
satisfies $x^2 + 1 = 0$, for example). In fact, this direction of investigation is
not a fruitful one. The most useful hints about the property distinguishing R 
from Q, the most compelling evidence for the necessity of elucidating this 




property, do not come from the study of numbers alone. In order to study the 
properties of the real numbers in a more profound way, we must study more 
than the real numbers. At this point we must begin with the foundations of 
calculus, in particular the fundamental concept on which calculus is based— 
functions. 


PROBLEMS

1. Prove the following formulas by induction.

(i) $1^2 + \cdots + n^2 = \dfrac{n(n + 1)(2n + 1)}{6}$.

(ii) $1^3 + \cdots + n^3 = (1 + \cdots + n)^2$.

2. Find a formula for

(i) $\sum_{i=1}^{n} (2i - 1) = 1 + 3 + 5 + \cdots + (2n - 1)$.

(ii) $\sum_{i=1}^{n} (2i - 1)^2 = 1^2 + 3^2 + 5^2 + \cdots + (2n - 1)^2$.

Hint: What do these expressions have to do with $1 + 2 + 3 + \cdots + 2n$ and
$1^2 + 2^2 + 3^2 + \cdots + (2n)^2$?
3. If 0 ≤ k ≤ n, the “binomial coefficient” $\binom{n}{k}$ is defined by

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!} = \frac{n(n - 1) \cdots (n - k + 1)}{k!}, \quad \text{if } k \ne 0, n,$$
$$\binom{n}{0} = \binom{n}{n} = 1.$$

(This becomes a special case of the first formula if we define 0! = 1.)

(a) Prove that

$$\binom{n + 1}{k} = \binom{n}{k - 1} + \binom{n}{k}.$$

(The proof does not require an induction argument.)

This relation gives rise to the following configuration, known as
“Pascal's triangle”: a number not on one of the sides is the sum of
the two numbers above it; the binomial coefficient $\binom{n}{k}$ is the kth
number in the nth row.


28 Prologue 


(b) Notice that all the numbers in Pascal's triangle are natural numbers.
Use part (a) to prove by induction that $\binom{n}{k}$ is always a natural
number. (Your entire proof by induction will, in a sense, be summed
up in a glance by Pascal's triangle.)

(c) Give another proof that $\binom{n}{k}$ is a natural number by showing that
$\binom{n}{k}$ is the number of sets of exactly $k$ integers each chosen from
$1, \ldots, n$.

(d) Prove the "binomial theorem": If $a$ and $b$ are any numbers and $n$ is
a natural number, then

$(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + b^n = \sum_{j=0}^{n} \binom{n}{j} a^{n-j} b^j.$


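(e) Prove that

(i) $\sum_{j=0}^{n} \binom{n}{j} = \binom{n}{0} + \cdots + \binom{n}{n} = 2^n$.
(ii) $\sum_{j=0}^{n} (-1)^j \binom{n}{j} = \binom{n}{0} - \binom{n}{1} + \cdots \pm \binom{n}{n} = 0$.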


4. (a) Prove by induction on $n$ that

$1 + r + r^2 + \cdots + r^n = \dfrac{1 - r^{n+1}}{1 - r}$

if $r \ne 1$ (if $r = 1$, evaluating the sum certainly presents no problem).
(b) Derive this result by setting $S = 1 + r + \cdots + r^n$, multiplying
this equation by $r$, and solving the two equations for $S$.
5. The formula for $1^2 + \cdots + n^2$ may be derived as follows. We begin
with the formula

$(k + 1)^3 - k^3 = 3k^2 + 3k + 1.$

Writing this formula for $k = 1, \ldots, n$ and adding, we obtain

$$\begin{aligned}
2^3 - 1^3 &= 3 \cdot 1^2 + 3 \cdot 1 + 1 \\
3^3 - 2^3 &= 3 \cdot 2^2 + 3 \cdot 2 + 1 \\
&\;\;\vdots \\
(n+1)^3 - n^3 &= 3 \cdot n^2 + 3 \cdot n + 1 \\
\hline
(n+1)^3 - 1 &= 3[1^2 + \cdots + n^2] + 3[1 + \cdots + n] + n.
\end{aligned}$$

Thus we can find $\sum_{k=1}^{n} k^2$ if we already know $\sum_{k=1}^{n} k$ (which could have been
found in a similar way). Use this method to find

(i) $1^3 + \cdots + n^3$.
(ii) $1^4 + \cdots + n^4$.
(iii) $\dfrac{1}{1 \cdot 2} + \dfrac{1}{2 \cdot 3} + \cdots + \dfrac{1}{n(n+1)}$.
(iv) $\dfrac{3}{1^2 \cdot 2^2} + \dfrac{5}{2^2 \cdot 3^2} + \cdots + \dfrac{2n+1}{n^2 (n+1)^2}$.
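As a numerical sanity check of the telescoping derivation in Problem 5 (a check, not a proof), the following Python sketch solves the summed identity $(n+1)^3 - 1 = 3[1^2 + \cdots + n^2] + 3[1 + \cdots + n] + n$ for the sum of squares and compares the result with a direct sum. The function names are illustrative assumptions and do not come from the text.

```python
def sum_squares_telescoped(n: int) -> int:
    """Solve (n+1)^3 - 1 = 3*S2 + 3*S1 + n for S2, where S1 = 1 + ... + n."""
    s1 = n * (n + 1) // 2                      # the known formula for 1 + ... + n
    return ((n + 1) ** 3 - 1 - 3 * s1 - n) // 3


def sum_squares_direct(n: int) -> int:
    """Add up 1^2 + ... + n^2 term by term."""
    return sum(k * k for k in range(1, n + 1))
```

The same pattern, starting from $(k+1)^4 - k^4$, would give a check for part (i).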


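*6. Use the method of Problem 5 to show that $\sum_{k=1}^{n} k^p$ can always be written
in the form

$\dfrac{n^{p+1}}{p+1} + An^p + Bn^{p-1} + Cn^{p-2} + \cdots.$

(The first 10 such expressions are

$$\begin{aligned}
\sum_{k=1}^{n} k &= \tfrac{1}{2}n^2 + \tfrac{1}{2}n \\
\sum_{k=1}^{n} k^2 &= \tfrac{1}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{1}{6}n \\
\sum_{k=1}^{n} k^3 &= \tfrac{1}{4}n^4 + \tfrac{1}{2}n^3 + \tfrac{1}{4}n^2 \\
\sum_{k=1}^{n} k^4 &= \tfrac{1}{5}n^5 + \tfrac{1}{2}n^4 + \tfrac{1}{3}n^3 - \tfrac{1}{30}n \\
\sum_{k=1}^{n} k^5 &= \tfrac{1}{6}n^6 + \tfrac{1}{2}n^5 + \tfrac{5}{12}n^4 - \tfrac{1}{12}n^2 \\
\sum_{k=1}^{n} k^6 &= \tfrac{1}{7}n^7 + \tfrac{1}{2}n^6 + \tfrac{1}{2}n^5 - \tfrac{1}{6}n^3 + \tfrac{1}{42}n \\
\sum_{k=1}^{n} k^7 &= \tfrac{1}{8}n^8 + \tfrac{1}{2}n^7 + \tfrac{7}{12}n^6 - \tfrac{7}{24}n^4 + \tfrac{1}{12}n^2 \\
\sum_{k=1}^{n} k^8 &= \tfrac{1}{9}n^9 + \tfrac{1}{2}n^8 + \tfrac{2}{3}n^7 - \tfrac{7}{15}n^5 + \tfrac{2}{9}n^3 - \tfrac{1}{30}n \\
\sum_{k=1}^{n} k^9 &= \tfrac{1}{10}n^{10} + \tfrac{1}{2}n^9 + \tfrac{3}{4}n^8 - \tfrac{7}{10}n^6 + \tfrac{1}{2}n^4 - \tfrac{3}{20}n^2 \\
\sum_{k=1}^{n} k^{10} &= \tfrac{1}{11}n^{11} + \tfrac{1}{2}n^{10} + \tfrac{5}{6}n^9 - n^7 + n^5 - \tfrac{1}{2}n^3 + \tfrac{5}{66}n.
\end{aligned}$$

Notice that the coefficients in the second column are always $\tfrac{1}{2}$, and that
after the third column the powers of $n$ with nonzero coefficients decrease
by 2 until $n^2$ or $n$ is reached. The coefficients in all but the first two
columns seem to be rather haphazard, but there actually is some sort of
pattern; finding it may be regarded as a super-perspicacity test. See
Problem 26-16 for the complete story.)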
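The closed forms listed in Problem 6 can be spot-checked with exact rational arithmetic. The sketch below does this for $p = 4$ and $p = 10$ only, reading the two formulas as $\tfrac{1}{5}n^5 + \tfrac{1}{2}n^4 + \tfrac{1}{3}n^3 - \tfrac{1}{30}n$ and $\tfrac{1}{11}n^{11} + \tfrac{1}{2}n^{10} + \tfrac{5}{6}n^9 - n^7 + n^5 - \tfrac{1}{2}n^3 + \tfrac{5}{66}n$. The dictionary layout and function names are assumptions made for illustration.

```python
from fractions import Fraction as F

# Coefficients of n^(p+1), n^p, ..., n^1 (highest power first) for p = 4 and p = 10.
FORMULAS = {
    4: [F(1, 5), F(1, 2), F(1, 3), F(0), F(-1, 30)],
    10: [F(1, 11), F(1, 2), F(5, 6), F(0), F(-1), F(0),
         F(1), F(0), F(-1, 2), F(0), F(5, 66)],
}


def closed_form(p: int, n: int) -> F:
    """Evaluate the tabulated polynomial for the sum of p-th powers at n."""
    top = p + 1
    return sum(c * n ** (top - i) for i, c in enumerate(FORMULAS[p]))


def power_sum(p: int, n: int) -> int:
    """Add up 1^p + ... + n^p directly."""
    return sum(k ** p for k in range(1, n + 1))
```

Agreement on finitely many values of $n$ is of course only evidence; the problem asks for a derivation.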

7. Prove that every natural number is either even or odd.

8. Prove that if a set $A$ of natural numbers contains $n_0$ and contains $k + 1$
whenever it contains $k$, then $A$ contains all natural numbers $\ge n_0$.

9. Prove the principle of mathematical induction from the well-ordering
principle.

10. Prove the principle of complete induction from the ordinary principle of
induction. Hint: If $A$ contains 1 and $A$ contains $n + 1$ whenever it
contains $1, \ldots, n$, consider the set $B$ of all $k$ such that $1, \ldots, k$ are all
in $A$.

11. (a) If $a$ is rational and $b$ is irrational, is $a + b$ necessarily irrational?
What if $a$ and $b$ are both irrational?
(b) If $a$ is rational and $b$ is irrational, is $ab$ necessarily irrational?
(Careful!)
(c) Is there a number $a$ such that $a^2$ is irrational, but $a^4$ is rational?
(d) Are there two irrational numbers whose sum and product are both
rational?

12. (a) Prove that $\sqrt{3}$, $\sqrt{5}$, and $\sqrt{6}$ are irrational. Hint: To treat $\sqrt{3}$, for
example, use the fact that every integer is of the form $3n$ or $3n + 1$
or $3n + 2$. Why doesn't this proof work for $\sqrt{4}$?
(b) Prove that $\sqrt[3]{2}$ and $\sqrt[3]{3}$ are irrational.

*13. Prove that
(a) $\sqrt{2} + \sqrt{3}$ is irrational.
(b) $\sqrt{6} - \sqrt{2} - \sqrt{3}$ is irrational.

14. (a) Prove that if $x = p + \sqrt{q}$ where $p$ and $q$ are rational, and $m$ is a
natural number, then $x^m = a + b\sqrt{q}$ for some rational $a$ and $b$.
(b) Prove also that $(p - \sqrt{q})^m = a - b\sqrt{q}$.

15. (a) Prove that if $m$ and $n$ are natural numbers and $m^2/n^2 < 2$, then
$(m + 2n)^2/(m + n)^2 > 2$; show, moreover, that

$\dfrac{(m + 2n)^2}{(m + n)^2} - 2 < 2 - \dfrac{m^2}{n^2}.$

(b) Prove the same results with all inequality signs reversed.
(c) Prove that if $m/n < \sqrt{2}$, then there is another rational number
$m'/n'$ with $m/n < m'/n' < \sqrt{2}$.
*16. It seems likely that $\sqrt{n}$ is irrational whenever the natural number $n$ is
not the square of another natural number. Although the method of
Problem 12 may actually be used to treat any particular case, it is not
clear in advance that it will always work, and a proof for the general
case requires some extra information. A natural number $p$ is called a
prime number if it is impossible to write $p = ab$ for natural numbers
$a$ and $b$ unless one of these is $p$, and the other 1; for convenience we also
agree that 1 is not a prime number. The first few prime numbers are 2,
3, 5, 7, 11, 13, 17, 19. If $n > 1$ is not a prime, then $n = ab$, with $a$ and $b$
both $< n$; if either $a$ or $b$ is not a prime it can be factored similarly;
continuing in this way proves that we can write $n$ as a product of primes.
For example, $28 = 4 \cdot 7 = 2 \cdot 2 \cdot 7$.

(a) Turn this argument into a rigorous proof by complete induction.
(To be sure, any reasonable mathematician would accept the
informal argument, but this is partly because it would be obvious
to him how to state it rigorously.)

A fundamental theorem about integers, which we will not prove here,
states that this factorization is unique, except for the order of the factors.
Thus, for example, 28 can never be written as a product of primes one
of which is 3, nor can it be written in a way that involves 2 only once
(now you should appreciate why 1 is not allowed as a prime).

(b) Using this fact, prove that $\sqrt{n}$ is irrational unless $n = m^2$ for some
natural number $m$.

(c) Prove more generally that $\sqrt[k]{n}$ is irrational unless $n = m^k$.

(d) No discussion of prime numbers should fail to allude to Euclid's
beautiful proof that there are infinitely many of them. Prove that
there cannot be only finitely many prime numbers $p_1, p_2, p_3, \ldots, p_n$
by considering $p_1 \cdot p_2 \cdots p_n + 1$.
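The informal factoring argument in Problem 16 (split off a factor, then factor what remains) can be sketched in Python as trial division. This is only an illustration under the assumption that the input is a natural number greater than 1; the function name is hypothetical, and the code proves nothing about uniqueness.

```python
def prime_factors(n: int) -> list[int]:
    """Write n > 1 as a product of primes, smallest factors first."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # d divides n, so split it off and keep factoring
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever remains has no smaller divisor, so it is prime
        factors.append(n)
    return factors
```

For the text's example, `prime_factors(28)` gives `[2, 2, 7]`. Applied to $2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 + 1 = 30031$ it gives `[59, 509]`, two primes that are not among 2, 3, 5, 7, 11, 13, which is the phenomenon part (d) exploits.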
*17. (a) Prove that if $x$ satisfies

$x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0,$

for some integers $a_{n-1}, \ldots, a_0$, then $x$ is irrational unless $x$ is an
integer. (Why is this a generalization of Problem 16?)
(b) Prove that $\sqrt{2} + \sqrt[3]{2}$ is irrational. Hint: Start by working out the
first 6 powers of this number.
18. Prove Bernoulli's inequality: If $h > -1$, then

$(1 + h)^n \ge 1 + nh.$

Why is this trivial if $h > 0$?


19. The Fibonacci sequence $a_1, a_2, a_3, \ldots$ is defined as follows:

$a_1 = 1,$
$a_2 = 1,$
$a_n = a_{n-1} + a_{n-2}$ for $n \ge 3$.

This sequence, which begins 1, 1, 2, 3, 5, 8, . . . , was discovered by
Fibonacci (circa 1175-1250), in connection with a problem about
rabbits. Fibonacci assumed that an initial pair of rabbits gave birth to
one new pair of rabbits per month, and that after two months each new
pair behaved similarly. The number $a_n$ of pairs born in the $n$th month is
$a_{n-1} + a_{n-2}$, because a pair of rabbits is born for each pair born the
previous month, and moreover each pair born two months ago now
gives birth to another pair. The number of interesting results about this
sequence is truly amazing; there is even a Fibonacci Association which
publishes a journal, The Fibonacci Quarterly. Prove that

$a_n = \dfrac{\left(\dfrac{1 + \sqrt{5}}{2}\right)^n - \left(\dfrac{1 - \sqrt{5}}{2}\right)^n}{\sqrt{5}}.$

One way of deriving this astonishing formula is presented in Problem
23-8.
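Before attempting the proof in Problem 19, it may help to see the closed formula agree with the recursive definition numerically. The sketch below is a floating-point check, so it is reliable only for moderate $n$; the function names are assumptions.

```python
from math import sqrt


def fib_recursive_definition(n: int) -> int:
    """a_1 = a_2 = 1 and a_n = a_(n-1) + a_(n-2), computed iteratively."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def fib_closed_form(n: int) -> float:
    """The closed formula, evaluated in floating point."""
    r5 = sqrt(5)
    return (((1 + r5) / 2) ** n - ((1 - r5) / 2) ** n) / r5
```

Rounding `fib_closed_form(n)` to the nearest integer reproduces the sequence 1, 1, 2, 3, 5, 8, . . . for the small values tested.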
20. The result in Problem 1-6 has an important generalization: If $a_1, \ldots, a_n \ge 0$, then

$\sqrt[n]{a_1 \cdots a_n} \le \dfrac{a_1 + \cdots + a_n}{n}.$

(a) Why is this true if $a_1 = \cdots = a_n$? Suppose not all $a_i$ are equal,
say $a_i \ne a_j$. If $a_i$ and $a_j$ are both replaced by $(a_i + a_j)/2$ what
happens to the "arithmetic mean" $A_n = (a_1 + \cdots + a_n)/n$?
What happens to the "geometric mean" $G_n = \sqrt[n]{a_1 \cdots a_n}$?
Why does repeating this process enough times eventually prove
that $G_n \le A_n$? (This is another place where it is a good exercise to
provide a formal proof by induction, as well as informal reason.)

The reasoning in the previous proof is closely related to another
interesting proof.

(b) Using the fact that $G_n \le A_n$ when $n = 2$, prove, by induction on
$k$, that $G_n \le A_n$ for $n = 2^k$.
(c) For a general $n$, let $2^m > n$. Apply part (b) to the $2^m$ numbers

$a_1, \ldots, a_n, \underbrace{A_n, \ldots, A_n}_{2^m - n \text{ times}}$

to prove that $G_n \le A_n$.


21. The following is a recursive definition of $a^n$:

$a^1 = a,$
$a^{n+1} = a^n \cdot a.$

Prove, by induction, that

$a^{n+m} = a^n \cdot a^m,$
$(a^n)^m = a^{nm}.$

(Don't try to be fancy: use either induction on $n$ or induction on $m$, not
both at once.)

22. Suppose we know properties P1 and P4 for the natural numbers, but
that multiplication has never been mentioned. Then the following can
be used as a recursive definition of multiplication:

$1 \cdot b = b,$
$(a + 1) \cdot b = a \cdot b + b.$

Prove the following (in the order suggested!):

$a \cdot (b + c) = a \cdot b + a \cdot c$ (use induction on $a$),
$a \cdot 1 = a,$
$a \cdot b = b \cdot a$ (you just finished proving the case $b = 1$).
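The recursive definition of multiplication in Problem 22 translates directly into a recursive function. The sketch below assumes Python integers stand in for the natural numbers and uses only addition and the predecessor `a - 1`; the name `times` is hypothetical. It can be used to test the three identities on small cases before proving them.

```python
def times(a: int, b: int) -> int:
    """Multiplication defined by 1*b = b and (a+1)*b = a*b + b."""
    if a < 1:
        raise ValueError("a must be a natural number")
    if a == 1:
        return b                    # base case: 1 * b = b
    return times(a - 1, b) + b      # recursive case: a * b = (a - 1) * b + b
```

Checking finitely many cases is no substitute for the induction the problem asks for, but it does confirm that the definition behaves like ordinary multiplication.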


23. In this chapter we began with the natural numbers and gradually built
up to the real numbers. A completely rigorous discussion of this process
requires a little book in itself (see Part V). No one has ever figured out
how to get to the real numbers without going through this process, but
if we do accept the real numbers as given, then the natural numbers can
be defined as the real numbers of the form 1, 1 + 1, 1 + 1 + 1, etc. The
whole point of this problem is to show that there is a rigorous
mathematical way of saying "etc."

(a) A set $A$ of real numbers is called inductive if

(1) 1 is in $A$,
(2) $k + 1$ is in $A$ whenever $k$ is in $A$.

Prove that

(i) $\mathbf{R}$ is inductive.
(ii) The set of positive real numbers is inductive.
(iii) The set of positive real numbers unequal to $\tfrac{1}{2}$ is inductive.
(iv) The set of positive real numbers unequal to 5 is not inductive.
(v) If $A$ and $B$ are inductive, then the set $C$ of real numbers which
are in both $A$ and $B$ is also inductive.

(b) A real number $n$ will be called a natural number if $n$ is in every
inductive set.

(i) Prove that 1 is a natural number.
(ii) Prove that $k + 1$ is a natural number if $k$ is a natural number.


PART II


FOUNDATIONS 


The statement is so frequently made
that the differential calculus deals with 
continuous magnitude, and yet 

an explanation of this continuity is
nowhere given;

even the most rigorous expositions 

of the differential calculus do not base 
their proofs upon continuity but, 

with more or less consciousness of the fact, 
they either appeal to geometric notions 
or those suggested by geometry, 

or depend upon theorems which are never 
established in a purely arithmetic manner. 
Among these, for example, 

belongs the above-mentioned theorem, 
and a more careful investigation 
convinced me that this theorem, or 

any one equivalent to it, can be regarded 
in some way as a sufficient basis 

for infinitesimal analysis. 

It then only remained to discover its true 
origin in the elements of arithmetic 

and thus at the same time 

to secure a real definition of 

the essence of continuity.

I succeeded Nov. 24, 1858, and

a few days afterward I communicated 
the results 

of my meditations to my dear friend 
Durége with whom I had a long 

and lively discussion. 


RICHARD DEDEKIND 


CHAPTER


FUNCTIONS


Undoubtedly the most important concept in all of mathematics is that of a
function; in almost every branch of modern mathematics functions turn out
to be the central objects of investigation. It will therefore probably not
surprise you to learn that the concept of a function is one of great generality.
Perhaps it will be a relief to learn that, for the present, we will be able to
restrict our attention to functions of a very special kind; even this small class
of functions will exhibit sufficient variety to engage our attention for quite
some time. We will not even begin with a proper definition. For the moment
a provisional definition will enable us to discuss functions at length, and will
illustrate the intuitive notion of functions, as understood by mathematicians.
Later, we will consider and discuss the advantages of the modern
mathematical definition. Let us therefore begin with the following:


PROVISIONAL DEFINITION

A function is a rule which assigns, to each of certain real numbers, some other
real number.


The following examples of functions are meant to illustrate and amplify this 
definition, which, admittedly, requires some such clarification. 


Example 1 The rule which assigns to each number the square of that
number.

Example 2 The rule which assigns to each number $y$ the number

$\dfrac{y^3 + 3y + 5}{y^2 + 1}.$

Example 3 The rule which assigns to each number $c \ne 1, -1$ the number

$\dfrac{c^3 + 3c + 5}{c^2 - 1}.$

Example 4 The rule which assigns to each number $x$ satisfying
$-17 \le x \le \pi/3$ the number $x^2$.

Example 5 The rule which assigns to each number $a$ the number 0 if $a$ is
irrational, and the number 1 if $a$ is rational.

Example 6 The rule which assigns

to 2 the number 5,
to 17 the number $36/\pi$,
to $\pi^2/17$ the number 28,
to $36/\pi$ the number 28,

and to any $y \ne 2, 17, \pi^2/17$, or $36/\pi$, the number 16 if $y$ is of the form
$a + b\sqrt{2}$ for $a$, $b$ in $\mathbf{Q}$.

Example 7 The rule which assigns to each number $t$ the number $t^3 + x$.
(This rule depends, of course, on what the number $x$ is, so we are really
describing infinitely many different functions, one for each number $x$.)

Example 8 The rule which assigns to each number $z$ the number of 7's in
the decimal expansion of $z$, if this number is finite, and $-\pi$ if there are
infinitely many 7's in the decimal expansion of $z$.


One thing should be abundantly clear from these examples—a function is 
any rule that assigns numbers to certain other numbers, not just a rule which 
can be expressed by an algebraic formula, or even by one uniform condition 
which applies to every number; nor is it necessarily a rule which you, or anybody
else, can actually apply in practice (no one knows, for example, what
rule 8 associates to $\pi$). Moreover, the rule may neglect some numbers and it
may not even be clear to which numbers the function applies (try to determine,
for example, whether the function in Example 6 applies to $\pi$). The set of
numbers to which a function does apply is called the domain of the function. 

Before saying anything else about functions we badly need some notation. 
Since throughout this book we shall frequently be talking about functions 
(indeed we shall hardly ever talk about anything else) we need a convenient 
way of naming functions, and of referring to functions in general. The standard 
practice is to denote a function by a letter. For obvious reasons the letter "f"
is a favorite, thereby making "g" and "h" other obvious candidates, but any
letter (or any reasonable symbol, for that matter) will do, not excluding "x"
and "y," although these letters are usually reserved for indicating numbers.
If f is a function, then the number which f associates to a number x is denoted
by f(x); this symbol is read "f of x" and is often called the value of f at x.
Naturally, if we denote a function by x, some other letter must be chosen to
denote the number (a perfectly legitimate, though perverse, choice would be
"f," leading to the symbol x(f)). Note that the symbol f(x) makes sense only
for x in the domain of f; for other x the symbol f(x) is not defined. 

If the functions defined in Examples 1-8 are denoted by $f$, $g$, $h$, $r$, $s$, $\theta$, $\alpha_x$,
and $y$, then we can rewrite their definitions as follows:

(1) $f(x) = x^2$ for all $x$.

(2) $g(y) = \dfrac{y^3 + 3y + 5}{y^2 + 1}$ for all $y$.

(3) $h(c) = \dfrac{c^3 + 3c + 5}{c^2 - 1}$ for all $c \ne 1, -1$.

(4) $r(x) = x^2$ for all $x$ such that $-17 \le x \le \pi/3$.

(5) $s(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational.} \end{cases}$

(6) $\theta(x) = \begin{cases} 5, & x = 2 \\ 36/\pi, & x = 17 \\ 28, & x = \pi^2/17 \\ 28, & x = 36/\pi \\ 16, & x \ne 2, 17, \pi^2/17, \text{ or } 36/\pi, \text{ and } x = a + b\sqrt{2} \text{ for } a, b \text{ in } \mathbf{Q}. \end{cases}$

(7) $\alpha_x(t) = t^3 + x$ for all numbers $t$.

(8) $y(x) = \begin{cases} n, & \text{exactly } n \text{ 7's appear in the decimal expansion of } x \\ -\pi, & \text{infinitely many 7's appear in the decimal expansion of } x. \end{cases}$
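Several of these definitions can be mirrored in code, which makes the role of the domain concrete. The sketch below reads the definitions as $f(x) = x^2$, $g(y) = (y^3 + 3y + 5)/(y^2 + 1)$, $h(c) = (c^3 + 3c + 5)/(c^2 - 1)$ for $c \ne 1, -1$, $r(x) = x^2$ for $-17 \le x \le \pi/3$, and $\alpha_x(t) = t^3 + x$. Raising `ValueError` outside the domain, and the name `alpha`, are choices made for this illustration; floating-point numbers only approximate real numbers, so functions such as $s$ and $y$, which depend on rationality or on full decimal expansions, are deliberately left out.

```python
import math


def f(x):
    return x ** 2


def g(y):
    return (y ** 3 + 3 * y + 5) / (y ** 2 + 1)


def h(c):
    if c in (1, -1):                       # h is not defined at 1 or -1
        raise ValueError("h is not defined at 1 or -1")
    return (c ** 3 + 3 * c + 5) / (c ** 2 - 1)


def r(x):
    if not (-17 <= x <= math.pi / 3):      # r has a restricted domain
        raise ValueError("x is outside the domain of r")
    return x ** 2


def alpha(x):
    """Return the function alpha_x; there is one such function for each number x."""
    def alpha_x(t):
        return t ** 3 + x
    return alpha_x
```

Here `alpha(3)` is itself a function, and `alpha(3)(0)` is the number 3, which matches the remark that Example 7 describes infinitely many different functions.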


These definitions illustrate the common procedure adopted for defining a 
function f—indicating what f(x) is for every number x in the domain of f. 
(Notice that this is exactly the same as indicating f(a) for every number a, or
f(b) for every number b, etc.) In practice, certain abbreviations are tolerated.
Definition (1) could be written simply

(1) $f(x) = x^2$,

the qualifying phrase "for all x" being understood. Of course, for definition
(4) the only possible abbreviation is

(4) $r(x) = x^2, \quad -17 \le x \le \pi/3$.

It is usually understood that a definition such as

$k(x) = \dfrac{1}{x} + \dfrac{1}{x - 1}, \quad x \ne 0, 1$

can be shortened to

$k(x) = \dfrac{1}{x} + \dfrac{1}{x - 1};$

in other words, unless the domain is explicitly restricted further, it is understood to
consist of all numbers for which the definition makes any sense at all.


You should have little difficulty checking the following assertions about the 
functions defined above: 


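$f(x + 1) = f(x) + 2x + 1$;
$g(x) = h(x)$ if $x^3 + 3x + 5 = 0$;
$r(x + 1) = r(x) + 2x + 1$ if $-17 \le x \le \pi/3 - 1$;
$s(x + y) = s(x)$ if $y$ is rational;
$\theta(\pi^2/17) = \theta(36/\pi)$;
$\alpha_x(x) = x \cdot [f(x) + 1]$;
$y(\tfrac{1}{3}) = 0$, $y(\tfrac{7}{9}) = -\pi$.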


If the expression f(s(a)) looks unreasonable to you, then you are forgetting 
that s(a) is a number like any other number, so that f(s(a)) makes sense. As 
a matter of fact, f(s(a)) = s(a) for all a. Why? Even more complicated expres- 
sions than f(s(a)) are, after a first exposure, no more difficult to unravel. The 
expression 


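$f(r(s(\theta(\alpha_3(y(\tfrac{1}{3}))))))$,


formidable as it appears, may be evaluated quite easily with a little patience:


$$\begin{aligned}
f(r(s(\theta(\alpha_3(y(\tfrac{1}{3})))))) &= f(r(s(\theta(\alpha_3(0))))) \\
&= f(r(s(\theta(3)))) \\
&= f(r(s(16))) \\
&= f(r(1)) \\
&= f(1) \\
&= 1.
\end{aligned}$$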
The first few problems at the end of this chapter give further practice manipu- 
lating this symbolism. 
The function defined in (1) is a rather special example of an extremely 
important class of functions, the polynomial functions. A function f is a
polynomial function if there are real numbers $a_0, \ldots, a_n$ such that


$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$, for all $x$


(when f(x) is written in this form it is usually tacitly assumed that $a_n \ne 0$).
The highest power of x with a nonzero coefficient is called the degree of f;
for example, the polynomial function f defined by $f(x) = 5x^6 + 137x^4 - \pi$
has degree 6.
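A polynomial function is determined by its coefficients, so one natural representation in code is a list of them. The sketch below stores $a_0, \ldots, a_n$ in a Python list (index $i$ holds $a_i$), evaluates the polynomial by Horner's rule, and reads off the degree. The list layout and the function names are assumptions made for illustration, not notation from the text.

```python
import math


def poly_eval(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0, where coeffs[i] is a_i (Horner's rule)."""
    result = 0
    for a in reversed(coeffs):     # start from a_n and work down to a_0
        result = result * x + a
    return result


def degree(coeffs):
    """Highest power of x whose coefficient is nonzero."""
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i] != 0:
            return i
    raise ValueError("every coefficient is 0, so no highest power exists")


# The example from the text: f(x) = 5x^6 + 137x^4 - pi.
example = [-math.pi, 0, 0, 0, 137, 0, 5]
```

With this representation `degree(example)` is 6, in agreement with the text.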

The functions defined in (2) and (3) belong to a somewhat larger class of 
functions, the rational functions; these are the functions of the form f/g
where f and g are polynomial functions (and g is not the function which is
always 0). The rational functions are themselves quite special examples of an 
even larger class of functions, very thoroughly studied in calculus, which are 
simpler than many of the functions first mentioned in this chapter. The 
following are examples of this kind of function: 


(9) $f(x) = \dfrac{x + x^2 + x \sin^2 x}{x \sin x + x \sin^2 x}$.

(10) $f(x) = \sin(x^2)$.

(11) $f(x) = \sin(\sin(x^2))$.

(12) $f(x) = \sin^2(\sin(\sin^2(x \sin^2 x^2))) \cdot \sin\left(\dfrac{x + \sin(x \sin x)}{x + \sin x}\right)$.


By what criterion, you may feel impelled to ask, can such functions, especially 
a monstrosity like (12), be considered simple? The answer is that they can be 
built up from a few simple functions using a few simple means of combining 
functions. In order to construct the functions (9)-(12) we need to start with
the "identity function" I, for which I(x) = x, and the "sine function" sin,
whose value sin(x) at x is often written simply sin x. The following are some
of the important ways in which functions may be combined to produce new 
functions. 

If f and g are any two functions, we can define a new function f + g, called 
the sum of f and g, by the equation 


(f + g)(x) = f(x) + g(x). 


Note that according to the conventions we have adopted, the domain of f + g
consists of all x for which "f(x) + g(x)" makes sense, i.e., the set of all x in both
domain f and domain g. If A and B are any two sets, then $A \cap B$ (read "A intersect
B" or "the intersection of A and B") denotes the set of x in both A and B;
this notation allows us to write domain $(f + g)$ = domain $f \cap$ domain $g$.


In a similar vein, we define the product $f \cdot g$ and the quotient $\dfrac{f}{g}$ (or $f/g$)
of f and g by

$(f \cdot g)(x) = f(x) \cdot g(x)$

and

$\left(\dfrac{f}{g}\right)(x) = \dfrac{f(x)}{g(x)}.$

Moreover, if g is a function and c is a number, we define a new function $c \cdot g$ by

$(c \cdot g)(x) = c \cdot g(x).$


This becomes a special case of the notation $f \cdot g$ if we agree that the symbol c
should also represent the function f defined by f(x) = c; such a function,
which has the same value for all numbers x, is called a constant function.
The domain of $f \cdot g$ is domain $f \cap$ domain $g$, and the domain of $c \cdot g$ is
simply the domain of g. On the other hand, the domain of f/g is rather
complicated; it may be written domain $f \cap$ domain $g \cap \{x : g(x) \ne 0\}$, the
symbol $\{x : g(x) \ne 0\}$ denoting the set of numbers x such that $g(x) \ne 0$. In
general, $\{x : \ldots\}$ denotes the set of all x such that ". . ." is true. Thus
$\{x : x^3 + 3 < 11\}$ denotes the set of all numbers x such that $x^3 < 8$, and
consequently $\{x : x^3 + 3 < 11\} = \{x : x < 2\}$. Either of these symbols could
just as well have been written using y everywhere instead of x. Variations of
this notation are common, but hardly require any discussion. Anyone can
guess that $\{x > 0 : x^3 < 8\}$ denotes the set of positive numbers whose cube is
less than 8; it could be expressed more formally as $\{x : x > 0 \text{ and } x^3 < 8\}$.
Incidentally, this set is equal to the set $\{x : 0 < x < 2\}$. One variation is
slightly less transparent, but very standard. The set $\{1, 3, 2, 4\}$, for example,
contains just the four numbers 1, 2, 3, and 4; it can also be denoted by
$\{x : x = 1 \text{ or } x = 3 \text{ or } x = 2 \text{ or } x = 4\}$.
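The domain bookkeeping for sums, products, and quotients can be made explicit in a small sketch. Here a function is modelled, purely as an assumption for illustration, by a rule together with a membership test for its domain; the class `Fn` and the helpers `fn_sum`, `fn_product`, and `fn_quotient` are hypothetical names. Each combinator builds the new domain exactly as described above: an intersection for the sum and the product, and for the quotient a further restriction to the numbers where the denominator is nonzero.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Fn:
    rule: Callable[[float], float]        # how to compute the value
    in_domain: Callable[[float], bool]    # membership test for the domain

    def __call__(self, x):
        if not self.in_domain(x):
            raise ValueError(f"{x} is not in the domain")
        return self.rule(x)


def fn_sum(f: Fn, g: Fn) -> Fn:
    # domain(f + g) is the intersection of domain f and domain g
    return Fn(lambda x: f.rule(x) + g.rule(x),
              lambda x: f.in_domain(x) and g.in_domain(x))


def fn_product(f: Fn, g: Fn) -> Fn:
    return Fn(lambda x: f.rule(x) * g.rule(x),
              lambda x: f.in_domain(x) and g.in_domain(x))


def fn_quotient(f: Fn, g: Fn) -> Fn:
    # additionally exclude the numbers x with g(x) = 0
    return Fn(lambda x: f.rule(x) / g.rule(x),
              lambda x: f.in_domain(x) and g.in_domain(x) and g.rule(x) != 0)
```

For example, with `I = Fn(lambda x: x, lambda x: True)`, the quotient `fn_quotient(I, I)` has the value 1 wherever it is defined, but 0 is not in its domain.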

Certain facts about the sum, product, and quotient of functions are obvious 
consequences of facts about sums, products, and quotients of numbers. For 
example, it is very easy to prove that 


$(f + g) + h = f + (g + h).$


The proof is characteristic of almost every proof which demonstrates that two
functions are equal: the two functions must be shown to have the same domain,
and the same value at any number in the domain. For example, to prove that
$(f + g) + h = f + (g + h)$, note that unraveling the definition of the two
sides gives

$$\begin{aligned}
[(f + g) + h](x) &= (f + g)(x) + h(x) \\
&= [f(x) + g(x)] + h(x)
\end{aligned}$$

and

$$\begin{aligned}
[f + (g + h)](x) &= f(x) + (g + h)(x) \\
&= f(x) + [g(x) + h(x)],
\end{aligned}$$

and the equality of $[f(x) + g(x)] + h(x)$ and $f(x) + [g(x) + h(x)]$ is a fact
about numbers. In this proof the equality of the two domains was not explicitly
mentioned because this is obvious, as soon as we begin to write down these
equations; the domain of $(f + g) + h$ and of $f + (g + h)$ is clearly domain
$f \cap$ domain $g \cap$ domain $h$. We naturally write $f + g + h$ for
$(f + g) + h = f + (g + h)$, precisely as we did for numbers.

It is just as easy to prove that $(f \cdot g) \cdot h = f \cdot (g \cdot h)$, and this function is
denoted by $f \cdot g \cdot h$. The equations $f + g = g + f$ and $f \cdot g = g \cdot f$ should also
present no difficulty.

Using the operations $+$, $\cdot$, $/$ we can now express the function f defined in
(9) by

$f = \dfrac{I + I \cdot I + I \cdot \sin \cdot \sin}{I \cdot \sin + I \cdot \sin \cdot \sin}.$

It should be clear, however, that we cannot express function (10) this way.
We require yet another way of combining functions. This combination, the
composition of two functions, is by far the most important.

If f and g are any two functions, we define a new function $f \circ g$, the
composition of f and g, by


$(f \circ g)(x) = f(g(x));$


the domain of $f \circ g$ is $\{x : x \text{ is in domain } g \text{ and } g(x) \text{ is in domain } f\}$. The
symbol "$f \circ g$" is often read "f circle g." Compared to the phrase "the
composition of f and g" this has the advantage of brevity, of course, but there is
another advantage of far greater import: there is much less chance of confusing
$f \circ g$ with $g \circ f$, and these must not be confused, since they are not usually
equal; in fact, almost any f and g chosen at random will illustrate this point
(try $f = I \cdot I$ and $g = \sin$, for example). Lest you become too apprehensive
about the operation of composition, let us hasten to point out that composition
is associative:


$(f \circ g) \circ h = f \circ (g \circ h)$

(and the proof is a triviality); this function is denoted by $f \circ g \circ h$. We can now
write the functions (10), (11), (12) as


(10) $f = \sin \circ (I \cdot I)$,
(11) $f = \sin \circ \sin \circ (I \cdot I)$,
(12) $f = (\sin \cdot \sin) \circ \sin \circ (\sin \cdot \sin) \circ (I \cdot [(\sin \cdot \sin) \circ (I \cdot I)]) \cdot \sin \circ \left(\dfrac{I + \sin \circ (I \cdot \sin)}{I + \sin}\right)$.

One fact has probably already become clear. Although this method of writing
functions reveals their "structure" very clearly, it is hardly short or
convenient. The shortest name for the function f such that $f(x) = \sin(x^2)$ for all
x unfortunately seems to be "the function f such that $f(x) = \sin(x^2)$ for
all x." The need for abbreviating this clumsy description has been clear for
two hundred years, but no reasonable abbreviation has received universal
acclaim. At present the strongest contender for this honor is something like


$x \to \sin(x^2)$


(read "x goes to $\sin(x^2)$" or just "x arrow $\sin(x^2)$"), but it is hardly popular
among writers of calculus textbooks. In this book we will tolerate a certain
amount of ellipsis, and speak of "the function $f(x) = \sin(x^2)$." Even more
popular is the quite drastic abbreviation: "the function $\sin(x^2)$." For the sake
of precision we will never use this description, which, strictly speaking,
confuses a number and a function, but it is so convenient that you will probably
end up adopting it for personal use. As with any convention, utility is the
motivating factor, and this criterion is reasonable so long as the slight logical
deficiencies cause no confusion. On occasion, confusion will arise unless a
more precise description is used. For example, "the function $x + t^3$" is an
ambiguous phrase; it could mean either


$x \to x + t^3$, i.e., the function f such that $f(x) = x + t^3$ for all x
or
$t \to x + t^3$, i.e., the function f such that $f(t) = x + t^3$ for all t.


As we shall see, however, for many important concepts associated with
functions, calculus has a notation which contains the "$x \to$" built in.
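Programming languages face the same ambiguity and resolve it by naming the variable explicitly, much as the arrow notation does. In the Python sketch below, the expression $x + t^3$ yields two different functions depending on which letter is treated as the variable; the function names are hypothetical.

```python
def as_function_of_x(t):
    """For a fixed number t, return the function x -> x + t^3."""
    return lambda x: x + t ** 3


def as_function_of_t(x):
    """For a fixed number x, return the function t -> x + t^3."""
    return lambda t: x + t ** 3
```

With t fixed at 2, the first function sends 1 to 9; with x fixed at 2, the second sends 1 to 3. The phrase "the function $x + t^3$" does not say which of the two is meant.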

By now we have made a sufficiently extensive investigation of functions to 
warrant reconsidering our definition. We have defined a function as a “rule,” 
but it is hardly clear what this means. If we ask "What happens if you break
this rule?" it is not easy to say whether this question is merely facetious or
actually profound. A more substantial objection to the use of the word "rule"
is that


$f(x) = x^2$

and

$f(x) = x^2 + 3x + 3 - 3(x + 1)$


are certainly different rules, if by a rule we mean the actual instructions given
for determining f(x); nevertheless, we want


$f(x) = x^2$

and

$f(x) = x^2 + 3x + 3 - 3(x + 1)$


to define the same function. For this reason, a function is sometimes defined
as an "association" between numbers; unfortunately the word "association"
escapes the objections raised against "rule" only because it is even more
vague.

There is, of course, a satisfactory way of defining functions, or we should 
never have gone to the trouble of criticizing our original definition. But a 
satisfactory definition can never be constructed by finding synonyms for 
English words which are troublesome. The definition which mathematicians 
have finally accepted for ‘“‘function”’ is a beautiful example of the means by 
which intuitive ideas have been incorporated into rigorous mathematics. The 
correct question to ask about a function is not "What is a rule?" or "What is
an association?" but "What does one have to know about a function in
order to know all about it?" The answer to the last question is easy: for each
number x one needs to know the number f(x); we can imagine a table which
would display all the information one could desire about the function
$f(x) = x^2$:


    x       f(x)

    1       1
    -1      1
    2       4
    -2      4
    √2      2
    -√2     2
    π       π²
    -π      π²

It is not even necessary to arrange the numbers in a table (which would 
actually be impossible if we wanted to list all of them). Instead of a two 
column array we can consider various pairs of numbers


$(1, 1),\ (-1, 1),\ (2, 4),\ (-2, 4),\ (\pi, \pi^2),\ (\sqrt{2}, 2),\ \ldots$


simply collected together into a set.* To find f(1) we simply take the second
number of the pair whose first member is 1; to find $f(\pi)$ we take the second
number of the pair whose first member is $\pi$. We seem to be saying that a
function might as well be defined as a collection of pairs of numbers. For example,
if we were given the following collection (which contains just 5 pairs):


$f = \{(1, 7),\ (3, 7),\ (5, 3),\ (4, 8),\ (8, 4)\},$

then f(1) = 7, f(3) = 7, f(5) = 3, f(4) = 8, f(8) = 4 and 1, 3, 4, 5, 8 are the
only numbers in the domain of f. If we consider the collection

$f = \{(1, 7),\ (3, 7),\ (2, 5),\ (1, 8),\ (8, 4)\},$


then f(3) = 7, f(2) = 5, f(8) = 4; but it is impossible to decide whether
f(1) = 7 or f(1) = 8. In other words, a function cannot be defined to be any
old collection of pairs of numbers; we must rule out the possibility which
arose in this case. We are therefore led to the following definition.


DEFINITION

A function is a collection of pairs of numbers with the following property:
if (a, b) and (a, c) are both in the collection, then b = c; in other words, the
collection must not contain two different pairs with the same first element.


This is our first full-fledged definition, and illustrates the format we shall 
always use to define significant new concepts. These definitions are so impor- 
tant (at least as important as theorems) that it is essential to know when one 
is actually at hand, and to distinguish them from comments, motivating 
remarks, and casual explanations. They will be preceded by the word 
DEFINITION, contain the term being defined in boldface letters, and con- 
stitute a paragraph unto themselves. 

There is one more definition (actually defining two things at once) which 
can now be made rigorously: 


DEFINITION

If f is a function, the domain of f is the set of all a for which there is some b
such that (a, b) is in f. If a is in the domain of f, it follows from the definition
of a function that there is, in fact, a unique number b such that (a, b) is in f.
This unique b is denoted by f(a).
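Both definitions can be tried out on finite collections of pairs. In the sketch below a collection is represented as a Python set of 2-tuples, an assumption that covers only finite examples; the helper names `is_function`, `domain`, and `value` are hypothetical. The two sample collections are the ones discussed in the text.

```python
def is_function(pairs) -> bool:
    """True if no two different pairs share the same first element."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True


def domain(pairs) -> set:
    """The set of all a for which some pair (a, b) is in the collection."""
    return {a for a, _ in pairs}


def value(pairs, a):
    """The unique b with (a, b) in the collection, i.e. f(a)."""
    for first, second in pairs:
        if first == a:
            return second
    raise KeyError(f"{a} is not in the domain")


good = {(1, 7), (3, 7), (5, 3), (4, 8), (8, 4)}
bad = {(1, 7), (3, 7), (2, 5), (1, 8), (8, 4)}
```

The first collection passes the test and has domain {1, 3, 4, 5, 8}. The second fails, because it contains both (1, 7) and (1, 8).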


With this definition we have reached our goal: the important thing about 
a function f is that a number f(x) is determined for each number x in its 
domain. You may feel that we have also reached the point where an intuitive 
definition has been replaced by an abstraction with which the mind can 
hardly grapple. Two consolations may be offered. First, although a function 


* The pairs occurring here are often called “ordered pairs,” to emphasize that, for example, 
(2, 4) is not the same pair as (4, 2). It is only fair to warn that we are going to define functions 
in terms of ordered pairs, another undefined term. Ordered pairs can be defined, however, 
and an appendix to this chapter has been provided for skeptics. 




has been defined as a collection of pairs, there is nothing to stop you from 
thinking of a function as a rule. Second, neither the intuitive nor the formal 
definition indicates the best way of thinking about functions. The best way 
is to draw pictures; but this requires a chapter all by itself. 


PROBLEMS 
1. Let $f(x) = 1/(1 + x)$. What is

(i) $f(f(x))$ (for which $x$ does this make sense?).
(ii) $f\left(\dfrac{1}{x}\right)$.
(iii) $f(cx)$.
(iv) $f(x + y)$.
(v) $f(x) + f(y)$.
(vi) For which numbers $c$ is there a number $x$ such that $f(cx) = f(x)$?
Hint: There are a lot more than you might think at first glance.
(vii) For which numbers $c$ is it true that $f(cx) = f(x)$ for two different
numbers $x$?

2. Let $g(x) = x^2$, and let

$h(x) = \begin{cases} 0, & x \text{ rational} \\ 1, & x \text{ irrational.} \end{cases}$

(i) For which $y$ is $h(y) \le y$?
(ii) For which $y$ is $h(y) \le g(y)$?
(iii) What is $g(h(z)) - h(z)$?
(iv) For which $w$ is $g(w) \le w$?
(v) For which $\varepsilon$ is $g(g(\varepsilon)) = g(\varepsilon)$?

3. Find the domain of the functions defined by the following formulas.

(i) $f(x) = \sqrt{1 - x^2}$.
(ii) $f(x) = \sqrt{1 - \sqrt{1 - x^2}}$.
(iii) $f(x) = \dfrac{1}{x - 1} + \dfrac{1}{x - 2}$.
(iv) $f(x) = \sqrt{1 - x^2} + \sqrt{x^2 - 1}$.
(v) $f(x) = \sqrt{1 - x} + \sqrt{x - 2}$.

4. Let $S(x) = x^2$, let $P(x) = 2^x$, and let $s(x) = \sin x$. Find each of the
following. In each case your answer should be a number.

(i) $(S \circ P)(y)$.
(ii) $(S \circ s)(y)$.
(iii) $(S \circ P \circ s)(t) + (s \circ P)(t)$.
(iv) $s(t^3)$.

5. Express each of the following functions in terms of $S$, $P$, $s$, using only
$+$, $\cdot$, and $\circ$ (for example, the answer to (i) is $P \circ s$). In each case your
answer should be a function.

(i) $f(x) = 2^{\sin x}$.
(ii) $f(x) = \sin 2^x$.
(iii) $f(x) = \sin x^2$.
(iv) $f(x) = \sin^2 x$ (remember that $\sin^2 x$ is an abbreviation for
$(\sin x)^2$).
(v) $f(t) = 2^{2^t}$. (Note: $a^{b^c}$ always means $a^{(b^c)}$; this convention is adopted
because $(a^b)^c$ can be written more simply as $a^{bc}$.)
(vi) $f(u) = \sin(2^u + 2^{u^2})$.
(vii) $f(y) = \sin(\sin(\sin(2^{2^{2^{\sin y}}})))$.
(viii) $f(a) = 2^{\sin^2 a} + \sin(a^2) + 2^{\sin(a^2 + \sin a)}$.


Polynomial functions, because they are simple, yet flexible, occupy a favored 
role in most investigations of functions. The following two problems illustrate 
their flexibility, and guide you through a derivation of their most important 
elementary properties. 


6. (a) If $x_1, \ldots, x_n$ are distinct numbers, find a polynomial function
       $f_i$ of degree n − 1 which is 1 at $x_i$ and 0 at $x_j$ for j ≠ i. Hint: The
       product of all $(x - x_j)$ for j ≠ i is 0 at $x_j$ if j ≠ i. (This product
       is usually denoted by

$$\prod_{\substack{j = 1 \\ j \ne i}}^{n} (x - x_j),$$

       the symbol $\Pi$ (capital pi) playing the same role for products that
       $\Sigma$ plays for sums.)
   (b) Now find a polynomial function f of degree n − 1 such that
       $f(x_i) = a_i$, where $a_1, \ldots, a_n$ are given numbers. (You should use
       the functions $f_i$ from part (a). The formula you will obtain is called
       the “Lagrange interpolation formula.”)
7. (a) Prove that for any polynomial function f, and any number a, there
       is a polynomial function g, and a number b, such that
       f(x) = (x − a)g(x) + b for all x. (The idea is simply to divide (x − a) into
       f(x) by long division, until a constant remainder is left. For example,
       the calculation

                   x^2 + x − 2
           x − 1 ) x^3       − 3x + 1
                   x^3 − x^2
                         x^2 − 3x
                         x^2 −  x
                              −2x + 1
                              −2x + 2
                                   −1

       shows that $x^3 - 3x + 1 = (x - 1)(x^2 + x - 2) - 1$. A formal
       proof is possible by induction on the degree of f.)
   (b) Prove that if f(a) = 0, then f(x) = (x − a)g(x) for some polynomial
       function g. (The converse is obvious.)
   (c) Prove that if f is a polynomial function of degree n, then f has at most
       n roots, i.e., there are at most n numbers a with f(a) = 0.
   (d) Show that for each n there is a polynomial function of degree n with
       n roots. If n is even find a polynomial function of degree n with no
       roots, and if n is odd find one with only one root.

8. For which numbers a, b, c, and d will the function

$$f(x) = \frac{ax + b}{cx + d}$$

   satisfy f(f(x)) = x for all x?
9. (a) If A is any set of real numbers, define a function $C_A$ as follows:

$$C_A(x) = \begin{cases} 1, & x \text{ in } A \\ 0, & x \text{ not in } A. \end{cases}$$

       Find expressions for $C_{A \cap B}$ and $C_{A \cup B}$ and $C_{\mathbf{R} - A}$, in terms of
       $C_A$ and $C_B$. (The symbol $A \cap B$ was defined in this chapter, but the
       other two may be new to you. They can be defined as follows:

$$A \cup B = \{x : x \text{ is in } A \text{ or } x \text{ is in } B\},$$
$$\mathbf{R} - A = \{x : x \text{ is in } \mathbf{R} \text{ but } x \text{ is not in } A\}.)$$

   (b) Suppose f is a function such that f(x) = 0 or 1 for each x. Prove that
       there is a set A such that $f = C_A$.
   (c) Show that $f = f^2$ if and only if $f = C_A$ for some set A.

10. (a) For which functions f is there a function g such that $f = g^2$? Hint:
        You can certainly answer this question if “function” is replaced by
        “number.”
    (b) For which functions f is there a function g such that f = 1/g?
   *(c) For which functions b and c can we find a function x such that

$$(x(t))^2 + b(t)x(t) + c(t) = 0$$

        for all numbers t?
   *(d) What conditions must the functions a and b satisfy if there is to be
        a function x such that

$$a(t)x(t) + b(t) = 0$$

        for all numbers t? How many such functions x will there be?
11. (a) Suppose that H is a function and y is a number such that H(H(y)) = y.
        What is

$$\underbrace{H(H(H(\cdots(H(y))\cdots)))}_{80 \text{ times}}\,?$$

    (b) Same question if 80 is replaced by 81.
    (c) Same question if H(H(y)) = H(y).
   *(d) Find a function H such that H(H(x)) = H(x) for all numbers x,
        and such that H(1) = 36, H(2) = π/3, H(13) = 47, H(36) = 36,
        H(π/3) = π/3, H(47) = 47. (Don’t try to “solve” for H(x); there
        are many functions H with H(H(x)) = H(x). The extra conditions
        on H are supposed to suggest a way of finding a suitable H.)
   *(e) Find a function H such that H(H(x)) = H(x) for all x, and such that
        H(1) = 7, H(17) = 18.
12. A function f is even if f(x) = f(−x) and odd if f(x) = −f(−x). For
    example, f is even if $f(x) = x^2$ or f(x) = |x| or f(x) = cos x, while f is
    odd if f(x) = x or f(x) = sin x.
    (a) Determine whether f + g is even, odd, or not necessarily either, in
        the four cases obtained by choosing f even or odd, and g even or odd.
        (Your answers can most conveniently be displayed in a 2 × 2
        table.)
    (b) Do the same for $f \cdot g$.
    (c) Do the same for $f \circ g$.
    (d) Prove that every even function f can be written f(x) = g(|x|), for
        infinitely many functions g.


13. (a) Prove that any function f with domain R can be written f = E + O,
        where E is even and O is odd.
    (b) Prove that this way of writing f is unique. (If you try to do part (b)
        first, by “solving” for E and O you will probably find the solution
        to part (a).)
14. If f is any function, define a new function |f| by |f|(x) = |f(x)|. If f and
    g are functions, define two new functions, max(f, g) and min(f, g), by

        max(f, g)(x) = max(f(x), g(x)),
        min(f, g)(x) = min(f(x), g(x)).

    Find an expression for max(f, g) and min(f, g) in terms of | |.
15. (a) Show that f = max(f, 0) + min(f, 0). This particular way of
        writing f is fairly useful; the functions max(f, 0) and min(f, 0) are
        called the positive and negative parts of f.
    (b) A function f is called nonnegative if f(x) ≥ 0 for all x. Prove that
        any function f can be written f = g − h, where g and h are
        nonnegative, in infinitely many ways. (The “standard way” is
        g = max(f, 0) and h = −min(f, 0).) Hint: Any number can certainly
        be written as the difference of two nonnegative numbers in infinitely
        many ways.


*16. Suppose f satisfies f(x + y) = f(x) + f(y) for all x and y.
    (a) Prove that $f(x_1 + \cdots + x_n) = f(x_1) + \cdots + f(x_n)$.
    (b) Prove that there is some number c such that f(x) = cx for all rational
        numbers x (at this point we’re not trying to say anything about f(x)
        for irrational x). Hint: First figure out what c must be. Now prove
        that f(x) = cx, first when x is a natural number, then when x is an
        integer, then when x is the reciprocal of an integer and, finally, for
        all rational x.


17. If f(x) = 0 for all x, then f satisfies f(x + y) = f(x) + f(y) for all x and
    y, and also $f(x \cdot y) = f(x) \cdot f(y)$ for all x and y. Now suppose that f
    satisfies these two properties, but that f(x) is not always 0. Prove that
    f(x) = x for all x, as follows:
    (a) Prove that f(1) = 1.
    (b) Prove that f(x) = x if x is rational.
    (c) Prove that f(x) > 0 if x > 0. (This part is tricky, but if you have
        been paying attention to the philosophical remarks accompanying
        the problems in the last two chapters, you will know what to do.)
    (d) Prove that f(x) > f(y) if x > y.
    (e) Prove that f(x) = x for all x. Hint: Use the fact that between any
        two numbers there is a rational number.
*18. Precisely what conditions must f, g, h, and k satisfy in order that
    f(x)g(y) = h(x)k(y) for all x and y?
*19. (a) Prove that there do not exist functions f and g with either of the
        following properties:
        (i) f(x) + g(y) = xy for all x and y.
        (ii) f(x + y) = g(x) − y for all x and y.
        Hint: Try to get some information about f or g by choosing
        particular values of x and y.
    (b) Find functions f and g such that f(x + y) = g(xy) for all x and y.
*20. (a) Find a function f, other than a constant function, such that
        |f(y) − f(x)| ≤ |y − x|.
    (b) Suppose that $f(y) - f(x) \le (y - x)^2$ for all x and y. (Why does this
        imply that $|f(y) - f(x)| \le (y - x)^2$?) Prove that f is a constant
        function. Hint: Divide the interval [x, y] into n equal pieces.


21. Prove or give a counterexample for each of the following assertions:
    (a) $f \circ (g + h) = f \circ g + f \circ h$.
    (b) $(g + h) \circ f = g \circ f + h \circ f$.
    (c) $\dfrac{1}{f \circ g} = \dfrac{1}{f} \circ g$.
    (d) $\dfrac{1}{f \circ g} = f \circ \left(\dfrac{1}{g}\right)$.
22. (a) Suppose $g = h \circ f$. Prove that if f(x) = f(y), then g(x) = g(y).
    (b) Conversely, suppose that f and g are two functions such that g(x) =
        g(y) whenever f(x) = f(y). Prove that $g = h \circ f$ for some function
        h. Hint: Just try to define h(z) when z is of the form z = f(x) (these
        are the only z that matter) and use the hypotheses to show that your
        definition will not run into trouble.

23. Suppose that $f \circ g = I$, where I(x) = x. Prove that
    (a) if x ≠ y, then g(x) ≠ g(y);
    (b) every number b can be written b = f(a) for some number a.
*24. (a) Suppose g is a function with the property that g(x) ≠ g(y) if x ≠ y.
        Prove that there is a function f such that $f \circ g = I$.
    (b) Suppose that f is a function such that every number b can be
        written b = f(a) for some number a. Prove that there is a function
        g such that $f \circ g = I$.
*25. Find a function f such that $g \circ f = I$ for some g, but such that there is no
    function h with $f \circ h = I$.
*26. Suppose $f \circ g = I$ and $h \circ f = I$. Prove that g = h. Hint: Use the fact
    that composition is associative.
27. (a) Suppose f(x) = x + 1. Are there any functions g such that
        $f \circ g = g \circ f$?
    (b) Suppose f is a constant function. For which functions g does
        $f \circ g = g \circ f$?
    (c) Suppose that $f \circ g = g \circ f$ for all functions g. Show that f is the
        identity function, f(x) = x.

28. (a) Let F be the set of all functions whose domain is R. Prove that,
        using $+$ and $\cdot$ as defined in this chapter, all of properties P1–P9
        except P7 hold for F, provided 0 and 1 are interpreted as constant
        functions.
    (b) Show that P7 does not hold.
   *(c) Show that P10–P12 cannot hold. In other words, show that there is
        no collection P of functions in F, such that P10–P12 hold for P. (It
        is sufficient, and will simplify things, to consider only functions
        which are 0 except at two points $x_0$ and $x_1$.)
    (d) Suppose we define f < g to mean that f(x) < g(x) for all x. Which
        of P′10–P′13 (in Problem 1-8) now hold?
    (e) If f < g, is $h \circ f < h \circ g$? Is $f \circ h < g \circ h$?


APPENDIX. ORDERED PAIRS


Not only in the definition of a function, but in other parts of the book as
well, it is necessary to use the notion of an ordered pair of objects. A definition
has not yet been given, and we have never even stated explicitly what properties
an ordered pair is supposed to have. The one property which we will require
states formally that the ordered pair (a, b) should be determined by a and b,
and the order in which they are given:


if (a, b) = (c, d), then a = c and b = d.


Ordered pairs may be treated most conveniently by simply introducing
(a, b) as an undefined term and adopting the basic property as an axiom;
since this property is the only significant fact about ordered pairs, there is not
much point worrying about what an ordered pair “really” is. Those who find
this treatment satisfactory need read no further.

The rest of this short appendix is for the benefit of those readers who will
feel uncomfortable unless ordered pairs are somehow defined so that this
basic property becomes a theorem. There is no point in restricting our
attention to ordered pairs of numbers; it is just as reasonable, and just as important,
to have available the notion of an ordered pair of any two mathematical
objects. This means that our definition ought to involve only concepts
common to all branches of mathematics. The one common concept which pervades
all areas of mathematics is that of a set, and ordered pairs (like everything
else in mathematics) can be defined in this context; an ordered pair will turn
out to be a set of a rather special sort.

The set {a, b}, containing the two elements a and b, is an obvious first
choice, but will not do as a definition for (a, b), because there is no way of
determining from {a, b} which of a or b is meant to be the first element. A
more promising candidate is the rather startling set:


{ {a}, {a, b} }.


This set has two members, both of which are themselves sets; one member is the
set {a}, containing the single member a, the other is the set {a, b}. Shocking
as it may seem, we are going to define (a, b) to be this set. The justification
for this choice is given by the theorem immediately following the definition:
the definition works, and there really isn’t anything else worth saying.


DEFINITION

(a, b) = { {a}, {a, b} }.


THEOREM 1

If (a, b) = (c, d), then a = c and b = d.


PROOF

The hypothesis means that


{ {a}, {a, b} } = { {c}, {c, d} }.


Now { {a}, {a, b} } contains just two members, {a} and {a, b}; and a is the
only common element of these two members of { {a}, {a, b} }. Similarly, c is
the unique common member of both members of { {c}, {c, d} }. Therefore
a = c. We therefore have


{ {a}, {a, b} } = { {a}, {a, d} },

and only the proof that b = d remains. It is convenient to distinguish 2 cases.


Case 1. b = a. In this case, {a, b} = {a}, so the set { {a}, {a, b} } really has only
one member, namely, {a}. The same must be true of { {a}, {a, d} }, so {a, d} =
{a}, which implies that d = a = b.


Case 2. b ≠ a. In this case, b is in one member of { {a}, {a, b} } but not in the
other. It must therefore be true that b is in one member of { {a}, {a, d} } but
not in the other. This can happen only if b is in {a, d}, but b is not in {a}; thus
b = a or b = d, but b ≠ a; so b = d. ∎
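The definition and Theorem 1 can be checked mechanically on small examples. The sketch below is only an illustration, assuming Python's built-in `frozenset` as a stand-in for "set"; the helper names `pair`, `first`, and `second` are hypothetical and not part of the text.

```python
def pair(a, b):
    """The set { {a}, {a, b} } used to define the ordered pair (a, b)."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def first(p):
    """Recover a: the only element common to every member of the pair."""
    (a,) = frozenset.intersection(*p)
    return a


def second(p):
    """Recover b, treating the one-member case b = a separately."""
    a = first(p)
    others = frozenset.union(*p) - {a}
    if not others:      # Case 1 of the proof: {a, b} = {a}, so b = a
        return a
    (b,) = others       # Case 2 of the proof: b is the element other than a
    return b
```

With these helpers, `pair(2, 4)` and `pair(4, 2)` are different sets, while `pair(3, 3)` collapses to a set with a single member, exactly as in Case 1.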


CHAPTER 4 GRAPHS


Mention the real numbers to a mathematician and the image of a straight 
line will probably form in his mind, quite involuntarily. And most likely he 
will neither banish nor too eagerly embrace this mental picture of the real 
numbers. ‘‘Geometric intuition” will allow him to interpret statements about 
numbers in terms of this picture, and may even suggest methods of proving 
them. Although the properties of the real numbers which were studied in 
Part I are not greatly illuminated by a geometric picture, such an interpretation
will be a great aid in Part II.
You are probably already familiar with the conventional method of
considering the straight line as a picture of the real numbers, i.e., of associating
to each real number a point on a line. To do this (Figure 1) we pick,
arbitrarily, a point which we label 0, and a point to the right, which we label 1.
The point twice as far to the right is labeled 2, the point the same distance
from 0 to 1, but to the left of 0, is labeled −1, etc. With this arrangement, if
a < b, then the point corresponding to a lies to the left of the point
corresponding to b. We can also draw rational numbers, such as 1/2, in the obvious
way. It is usually taken for granted that the irrational numbers also somehow
fit into this scheme, so that every real number can be drawn as a point on the
line. We will not make too much fuss about justifying this assumption, since
this method of “drawing” numbers is intended solely as a method of picturing
certain abstract ideas, and our proofs will never rely on these pictures
(although we will frequently use a picture to suggest or help explain a proof).
Because this geometric picture plays such a prominent, albeit inessential, role,
geometric terminology is frequently employed when speaking of numbers;
thus a number is sometimes called a point, and R is often called the real line.

FIGURE 1

The number |a − b| has a simple interpretation in terms of this geometric
picture: it is the distance between a and b, the length of the line segment which
has a as one end point and b as the other. This means, to choose an example
whose frequent occurrence justifies special consideration, that the set of
numbers x which satisfy |x − a| < ε may be pictured as the collection of
points whose distance from a is less than ε. This set of points is the “interval”
from a − ε to a + ε, which may also be described as the points corresponding
to numbers x with a − ε < x < a + ε (Figure 2).
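The two descriptions of this set can be compared on a handful of sample points. The following sketch assumes exact rational arithmetic (Python's `fractions.Fraction`) so that no rounding intrudes; the function names and the particular values of a and ε are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def within_distance(x, a, eps):
    """True when the distance |x - a| is less than eps."""
    return abs(x - a) < eps


def in_open_interval(x, a, eps):
    """True when a - eps < x < a + eps."""
    return a - eps < x < a + eps


a, eps = Fraction(1, 2), Fraction(1, 4)
samples = [Fraction(k, 8) for k in range(-8, 17)]   # -1, -7/8, ..., 2

# The two conditions agree at every sample point.
agree = all(within_distance(x, a, eps) == in_open_interval(x, a, eps)
            for x in samples)

# The sample points that lie in the interval from a - eps to a + eps.
inside = [x for x in samples if within_distance(x, a, eps)]
```

A finite check of this kind proves nothing, of course; it only illustrates the equivalence that follows from the properties of absolute value.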

Sets of numbers which correspond to intervals arise so frequently that it is
desirable to have special names for them. The set {x: a < x < b} is denoted
by (a, b) and called the open interval from a to b. This notation naturally
creates some ambiguity, since (a, b) is also used to denote a pair of numbers,
but in context it is always clear (or can easily be made clear) whether one is
talking about a pair or an interval. Note that if a ≥ b, then (a, b) = ∅, the set
with no elements; in practice, however, it is almost always assumed (explicitly
if one has been careful, and implicitly otherwise), that whenever an interval
(a, b) is mentioned, the number a is less than b.

FIGURE 2

The set {x: a ≤ x ≤ b} is denoted by [a, b] and is called the closed interval
from a to b. This symbol is usually reserved for the case a < b, but it is
sometimes used for a = b, also. The usual pictures for the intervals (a, b) and
[a, b] are shown in Figure 3; since no reasonably accurate picture could ever
indicate the difference between the two intervals, various conventions have
been adopted. Figure 3 also shows certain “infinite” intervals. The set
{x: x > a} is denoted by (a, ∞), while the set {x: x ≥ a} is denoted by [a, ∞);
the sets (−∞, a) and (−∞, a] are defined similarly. At this point a standard
warning must be issued: the symbols ∞ and −∞, though usually read
“infinity” and “minus infinity,” are purely suggestive; there is no number
“∞” which satisfies ∞ ≥ a for all numbers a. While the symbols ∞ and −∞
will appear in many contexts, it is always necessary to define these uses in
ways that refer only to numbers. The set R of all real numbers is also
considered to be an “interval,” and is sometimes denoted by (−∞, ∞).

FIGURE 3 (the open interval (a, b); the closed interval [a, b]; the interval (−∞, a]; the interval (a, ∞))
Of even greater interest to us than the method of drawing numbers is a
method of drawing pairs of numbers. This procedure, probably also familiar
to you, requires a “coordinate system,” two straight lines intersecting at right
angles. To distinguish these straight lines, we call one the horizontal axis, and
one the vertical axis. (More prosaic terminology, such as the “first” and
“second” axes, is probably preferable from a logical point of view, but most
people hold their books, or at least their blackboards, in the same way, so that
“horizontal” and “vertical” are more descriptive.) Each of the two axes could
be labeled with real numbers, but we can also label points on the horizontal
axis with pairs (a, 0) and points on the vertical axis with pairs (0, b), so that
the intersection of the two axes, the “origin” of the coordinate system, is
labeled (0, 0). Any point (a, b) can now be drawn as in Figure 4, lying at the
vertex of the rectangle whose other three vertices are labeled (0, 0), (a, 0),
and (0, b). The numbers a and b are called the first and second coordinates,
respectively, of the point determined in this way.

FIGURE 4

Our real concern, let us recall, is a method of drawing functions. Since a 
function is just a collection of pairs of numbers, we can draw a function by 
drawing each of the pairs in the function. The drawing obtained in this way 
is called the graph of the function. In other words, the graph of f contains 
all the points corresponding to pairs (x, f(x)). Since most functions contain 
infinitely many pairs, drawing the graph promises to be a laborious under- 
taking, but, in fact, many functions have graphs which are quite easy to draw. 

Not surprisingly, the simplest functions of all, the constant functions 
f(x) = c, have the simplest graphs. It is easy to see that the graph of the function
f(x) = c is a straight line parallel to the horizontal axis, at distance c
from it (Figure 5).

FIGURE 5

FIGURE 6

FIGURE 7 (labels: O = (0, 0), A = (x, cx), B′)

FIGURE 8 (labels: (a, b), (c, d); side lengths c − a and d − b)

FIGURE 9

The functions f(x) = cx also have particularly simple graphs: straight
lines through (0, 0), as in Figure 6. A proof of this fact is indicated in Figure 7:
Let x be some number not equal to 0, and let L be the straight line which
passes through the origin O, corresponding to (0, 0), and through the point
A, corresponding to (x, cx). A point A′, with first coordinate y, will lie on L
when the triangle A′B′O is similar to the triangle ABO, thus when

$$\frac{A'B'}{OB'} = \frac{AB}{OB} = c;$$

this is precisely the condition that A′ correspond to the pair (y, cy), i.e., that
A′ lies on the graph of f. The argument has implicitly assumed that c > 0, but
the other cases are treated easily enough. The number c, which measures the
ratio of the sides of the triangles appearing in the proof, is called the slope of the
straight line, and a line parallel to this line is also said to have slope c.

This demonstration has neither been labeled nor treated as a formal proof. 
Indeed, a rigorous demonstration would necessitate a digression which we are 
not at all prepared to follow. The rigorous proof of any statement connecting 
geometric and algebraic concepts would first require a real proof (or a pre- 
cisely stated assumption) that the points on a straight line correspond in 
an exact way to the real numbers. Aside from this, it would be necessary to 
develop plane geometry as precisely as we intend to develop the properties of 
real numbers. Now the detailed development of plane geometry is a beautiful 
subject, but it is by no means a prerequisite for the study of calculus. We shall 
use geometric pictures only as an aid to intuition; for our purposes (and for 
most of mathematics) it is perfectly satisfactory to define the plane to be the 
set of all pairs of real numbers, and to define straight lines as certain collections 
of pairs, including, among others, the collections {(x, cx): x a real number}. To
provide this artificially constructed geometry with all the structure of geometry
studied in high school, one more definition is required. If (a, b) and (c, d) are
two points in the plane, i.e., pairs of real numbers, we define the distance
between (a, b) and (c, d) to be

$$\sqrt{(a - c)^2 + (b - d)^2}.$$

If the motivation for this definition is not clear, Figure 8 should serve as
adequate explanation: with this definition the Pythagorean theorem has
been built into our geometry.*
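As a small illustration of this definition, the distance between two pairs can be computed directly from the formula. This is a sketch only: the function name `distance` is an assumption, and floating-point square roots stand in for exact real numbers.

```python
import math


def distance(p, q):
    """Distance between the points p = (a, b) and q = (c, d) of the plane."""
    (a, b), (c, d) = p, q
    return math.sqrt((a - c) ** 2 + (b - d) ** 2)


# A right triangle with horizontal leg 3 and vertical leg 4,
# in the arrangement suggested by Figure 8.
corner_1 = (1, 1)
corner_2 = (4, 5)
hypotenuse = distance(corner_1, corner_2)
```

The legs have lengths 4 − 1 = 3 and 5 − 1 = 4, so the Pythagorean theorem predicts a hypotenuse of 5, which is what the formula returns.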

Reverting once more to our informal geometric picture, it is not hard to
see (Figure 9) that the graph of the function f(x) = cx + d is a straight line
with slope c, passing through the point (0, d). For this reason, the functions
f(x) = cx + d are called linear functions. Simple as they are, linear functions
occur frequently, and you should feel comfortable working with them. The
following is a typical problem whose solution should not cause any trouble.
Given two distinct points (a, b) and (c, d), find the linear function f whose
graph goes through (a, b) and (c, d). This amounts to saying that f(a) = b and
f(c) = d. If f is to be of the form f(x) = αx + β, then we must have

$$\alpha a + \beta = b,$$
$$\alpha c + \beta = d;$$

therefore α = (d − b)/(c − a) and β = b − [(d − b)/(c − a)]a, so

$$f(x) = \frac{d - b}{c - a}\,x + b - \frac{d - b}{c - a}\,a = \frac{d - b}{c - a}\,(x - a) + b,$$

a formula most easily remembered by using the “point-slope form” (see
Problem 5).


* The fastidious reader might object to this definition on the grounds that nonnegative numbers
are not yet known to have square roots. This objection is really unanswerable at the moment;
the definition will just have to be accepted with reservations, until this little point is settled.


FIGURE 10
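The solution just derived translates into a few lines of code. The sketch below assumes integer or rational coordinates so that `fractions.Fraction` keeps the slope exact; the name `linear_through` is illustrative, not from the text.

```python
from fractions import Fraction


def linear_through(p, q):
    """Return the linear function whose graph passes through p = (a, b) and q = (c, d)."""
    (a, b), (c, d) = p, q
    if a == c:
        # The two points lie on a vertical line, which is not the graph of a function.
        raise ValueError("the two points have the same first coordinate")
    slope = Fraction(d - b, c - a)           # alpha = (d - b)/(c - a)
    return lambda x: slope * (x - a) + b     # f(x) = alpha (x - a) + b


f = linear_through((1, 2), (3, 8))
```

For these two points the slope is (8 − 2)/(3 − 1) = 3, so f(x) = 3(x − 1) + 2, and one can confirm that f(1) = 2 and f(3) = 8.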
Of course, this solution is possible only if a # c; the graphs of linear functions 
account only for the straight lines which are not parallel to the vertical axis. 
The vertical straight lines are not the graph of any function at all; in fact, the 
graph of a function can never contain even two distinct points on the same 
vertical line. This conclusion is immediate from the definition of a function— 
two points on the same vertical line correspond to pairs of the form (a, b) and
(a, c) and, by definition, a function cannot contain (a, b) and (a, c) if b ≠ c.
Conversely, if a set of points in the plane has the property that no two points 
lie on the same vertical line, then it is surely the graph of a function. Thus the 
first 2 sets in Figure 10 are not graphs of functions and the last 2 are; notice 
that the fourth is the graph of a function whose domain is not all of R, since 
some vertical lines have no points on them at all. 
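For a finite set of points, this criterion is easy to test by grouping the points according to their first coordinates. The following is a sketch under that finiteness assumption; the names `vertical_line_violations` and `is_graph_of_function` are hypothetical.

```python
from collections import defaultdict


def vertical_line_violations(points):
    """Map each first coordinate x to its heights, keeping only the x
    whose vertical line contains two or more distinct points."""
    heights = defaultdict(set)
    for x, y in points:
        heights[x].add(y)
    return {x: ys for x, ys in heights.items() if len(ys) > 1}


def is_graph_of_function(points):
    """True when no two distinct points lie on the same vertical line."""
    return not vertical_line_violations(points)


parabola_points = [(x, x * x) for x in range(-3, 4)]
circle_points = [(0, 1), (0, -1), (1, 0), (-1, 0)]
```

The sample of the parabola passes the test, while the four points of the unit circle fail it because (0, 1) and (0, −1) share a vertical line.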

After the linear functions the simplest is perhaps the function $f(x) = x^2$.
If we draw some of the pairs in f, i.e., some of the pairs of the form $(x, x^2)$, we
obtain a picture like Figure 11.


FIGURE 11 


FIGURE 12 (graph of $f(x) = x^2$)

FIGURE 13 (panels (a), (b), (c))

It is not hard to convince yourself that all the pairs $(x, x^2)$ lie along a curve
like the one shown in Figure 12; this curve is known as a parabola.

Since a graph is just a drawing on paper, made (in this case) with printer’s 
ink, the question “Is this what the graph really looks like?” is hard to phrase
in any sensible manner. No drawing is ever really correct since the line has 
thickness. Nevertheless, there are some questions which one can ask: for 
example, how can you be sure that the graph does not look like one of the 
drawings in Figure 13? It is easy to see, and even to prove, that the graph 
cannot look like (a); for if 0 < x < y, then $x^2 < y^2$, so the graph should be
higher at y than at x, which is not the case in (a). It is also easy to see, simply
by drawing a very accurate graph, first plotting many pairs $(x, x^2)$, that the
graph cannot have a large “jump” as in (b) or a “corner” as in (c). In order
to prove these assertions, however, we first need to say, in a mathematical
way, what it means for a function not to have a “jump” or “corner”; these
ideas already involve some of the fundamental concepts of calculus. Eventually 
we will be able to define them rigorously, but meanwhile you may amuse 
yourself by attempting to define these concepts, and then examining your 
definitions critically. Later these definitions may be compared with the ones 
mathematicians have agreed upon. If they compare favorably, you are cer- 
tainly to be congratulated! 
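The one assertion above that can already be proved, namely that $x^2 < y^2$ whenever 0 < x < y, takes only two steps. The following worked derivation is a sketch that assumes the rule, established for numbers in Part I, that an inequality may be multiplied by a positive number.

```latex
\begin{align*}
x < y \ \text{and}\ x > 0 &\implies x \cdot x < x \cdot y, \\
x < y \ \text{and}\ y > 0 &\implies x \cdot y < y \cdot y, \\
\text{hence}\quad x^2 < xy < y^2 .
\end{align*}
```

So the graph of $f(x) = x^2$ is higher at y than at x whenever 0 < x < y, which rules out a picture like (a).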

The functions $f(x) = x^n$, for various natural numbers n, are sometimes
called power functions. Their graphs are most easily compared as in Figure 
14, by drawing several at once. 

FIGURE 14

FIGURE 15

The power functions are only special cases of polynomial functions,
introduced in the previous chapter. Two particular polynomial functions are
graphed in Figure 15, while Figure 16 is meant to give a general idea of the
graph of the polynomial function

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0,$$

in the case $a_n > 0$.

In general, the graph of f will have at most n − 1 “peaks” or “valleys” (a
“peak” is a point like (x, f(x)) in Figure 16, while a “valley” is a point like
(y, f(y))). The number of peaks and valleys may actually be much smaller (the
power functions, for example, have at most one valley). Although these 
assertions are easy to make, we will not even contemplate giving proofs until 
Part III (once the powerful methods of Part III are available, the proofs will 
be very easy). 

Figure 17 illustrates the graphs of several rational functions. The rational 
functions exhibit even greater variety than the polynomial functions, but their 
behavior will also be easy to analyze once we can use the derivative, the basic 
tool of Part III.

Many interesting graphs can be constructed by “piecing together” the
graphs of functions already studied. The graph in Figure 18 is made up
entirely of straight lines. The function f with this graph satisfies

$$f\left(\frac{1}{n}\right) = (-1)^{n+1},$$
$$f\left(-\frac{1}{n}\right) = (-1)^{n+1},$$
$$f(x) = 1, \quad |x| \ge 1,$$

and is a linear function on each interval [1/(n + 1), 1/n] and [−1/n,
−1/(n + 1)]. (The number 0 is not in the domain of f.) Of course, one can
write out an explicit formula for f(x), when x is in [1/(n + 1), 1/n]; this is a
good exercise in the use of linear functions, and will also convince you that a
picture is worth a thousand words.


FIGURE 16 (panels (a) and (b))

FIGURE 17

FIGURE 18

FIGURE 19


It is actually possible to define, in a much simpler way, a function which 
exhibits this same property of oscillating infinitely often near 0, by using the 
sine function. In Chapter 15 we will discuss this function in detail, and 
radian measure in particular; for the time being it will be easiest to use degree 
measurements for angles. The graph of the sine function is shown in Figure 19 
(the scale on the horizontal axis has been altered so that the graph will be 
clearer; radian measure has, besides important mathematical properties, the 
additional advantage that such changes are unnecessary). 

FIGURE 20 (graph of f(x) = sin 1/x)

FIGURE 21

Now consider the function f(x) = sin 1/x. The graph of f is shown in
Figure 20. (To draw this graph it helps to first observe that

$$f(x) = 0 \quad \text{for } x = \frac{1}{180}, \frac{1}{360}, \frac{1}{540}, \ldots,$$
$$f(x) = 1 \quad \text{for } x = \frac{1}{90}, \frac{1}{90 + 360}, \frac{1}{90 + 720}, \ldots,$$
$$f(x) = -1 \quad \text{for } x = \frac{1}{270}, \frac{1}{270 + 360}, \frac{1}{270 + 720}, \ldots.)$$

Notice that when x is large, so that 1/x is small, f(x) is also small; when x
is “large negative,” that is, when |x| is large for negative x, again f(x) is close
to 0, although f(x) < 0.
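These observations are easy to confirm numerically. The sketch below assumes, as the text does for the time being, that angles are measured in degrees, so the argument is converted with `math.radians` before `math.sin` is applied; the helper name `sin_deg` is illustrative.

```python
import math


def sin_deg(theta):
    """Sine of an angle given in degrees."""
    return math.sin(math.radians(theta))


def f(x):
    """f(x) = sin 1/x, with 1/x read as a number of degrees (x must not be 0)."""
    return sin_deg(1 / x)


zeros = [f(1 / 180), f(1 / 360), f(1 / 540)]              # each very nearly 0
peaks = [f(1 / 90), f(1 / (90 + 360)), f(1 / (90 + 720))]  # each very nearly 1
valleys = [f(1 / 270), f(1 / (270 + 360))]                 # each very nearly -1
far_right, far_left = f(1000), f(-1000)                    # small positive, small negative
```

Because of floating-point rounding the computed values agree with 0, 1, and −1 only approximately, which is all a picture could show in any case.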

FIGURE 22 (graph of $f(x) = x^2 \sin 1/x$)

FIGURE 23

FIGURE 25

An interesting modification of this function is f(x) = x sin 1/x. The graph
of this function is sketched in Figure 21. Since sin 1/x oscillates infinitely often
near 0 between 1 and −1, the function f(x) = x sin 1/x oscillates infinitely
often between x and −x. The behavior of the graph for x large or large
negative is harder to analyze. Since sin 1/x is getting close to 0, while x is getting
larger and larger, there seems to be no telling what the product will do. It is
possible to decide, but this is another question that is best deferred to Part
III. The graph of $f(x) = x^2 \sin 1/x$ has also been illustrated (Figure 22).
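The claim that x sin 1/x stays between x and −x can be spot-checked near 0. This is a minimal sketch, again assuming degree measure for the angle 1/x; the name `g` and the sample points are illustrative choices.

```python
import math


def g(x):
    """g(x) = x sin 1/x, with 1/x read as a number of degrees (x must not be 0)."""
    return x * math.sin(math.radians(1 / x))


# Sample points on both sides of 0, excluding 0 itself.
xs = [k / 1000 for k in range(-50, 51) if k != 0]

# The graph never leaves the region between the lines y = x and y = -x.
bounded = all(abs(g(x)) <= abs(x) for x in xs)

# It touches those lines where sin 1/x is 1 or -1.
touches_upper = g(1 / 90)      # very nearly  1/90
touches_lower = g(1 / 270)     # very nearly -1/270
```

The check says nothing about large x, where, as noted above, the behavior of the product is a subtler question.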
For these infinitely oscillating functions, it is clear that the graph cannot
hope to be really “accurate.” The best we can do is to show part of it, and
leave out the part near 0 (which is the interesting part). Actually, it is easy
to find much simpler functions whose graph cannot be “accurately” drawn.
The graphs of

$$f(x) = \begin{cases} x^2, & x < 1 \\ 2, & x \ge 1 \end{cases} \qquad \text{and} \qquad g(x) = \begin{cases} x^2, & x \le 1 \\ 2, & x > 1 \end{cases}$$

can only be distinguished by some convention similar to that used for open
and closed intervals (Figure 23).
Our last example is a function whose graph is spectacularly nondrawable:

$$f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational.} \end{cases}$$

FIGURE 24

The graph of f must contain infinitely many points on the horizontal axis and
also infinitely many points on a line parallel to the horizontal axis, but it must
not contain either of these lines entirely. Figure 24 shows the usual textbook
picture of the graph. To distinguish the two parts of the graph, the dots are 
placed closer together on the line corresponding to irrational x. (There is 
actually a mathematical reason behind this convention, but it depends on 
some sophisticated ideas, introduced in Problems 20-6 and 20-7.) 

The peculiarities exhibited by some functions are so engrossing that it is
easy to forget some of the simplest, and most important, subsets of the plane,
which are not the graphs of functions. The most important example of all is
the circle. A circle with center (a, b) and radius r > 0 contains, by definition,
all the points (x, y) whose distance from (a, b) is equal to r. The circle thus
consists (Figure 25) of all points (x, y) with

\[
\sqrt{(x - a)^2 + (y - b)^2} = r
\]
or
\[
(x - a)^2 + (y - b)^2 = r^2.
\]




The circle with center (0, 0) and radius 1, often regarded as a sort of standard
copy, is called the unit circle.

A close relative of the circle is the ellipse. This is defined as the set of points, 
the sum of whose distances from two fixed points is a constant. (When the two 
fixed points are the same, we obtain a circle.) If, for convenience, the two 
points are taken to be (—c, 0) and (c, 0), and the sum of the distances is taken 
to be 2a (the factor 2 simplifies some algebra), then (x, y) is on the ellipse if 
and only if

\[
\sqrt{(x - (-c))^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a
\]
or
\[
\sqrt{(x + c)^2 + y^2} = 2a - \sqrt{(x - c)^2 + y^2}
\]
or
\[
x^2 + 2cx + c^2 + y^2 = 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + x^2 - 2cx + c^2 + y^2
\]
or
\[
4(cx - a^2) = -4a\sqrt{(x - c)^2 + y^2}
\]
or
\[
c^2x^2 - 2cxa^2 + a^4 = a^2(x^2 - 2cx + c^2 + y^2)
\]
or
\[
(c^2 - a^2)x^2 - a^2y^2 = a^2(c^2 - a^2)
\]
or
\[
\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.
\]
This is usually written simply
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,
\]
where b = √(a² − c²) (since we must clearly choose a > c, it follows that
a² − c² > 0). A picture of an ellipse is shown in Figure 26. The ellipse
intersects the horizontal axis when y = 0, so that
\[
\frac{x^2}{a^2} = 1, \qquad x = \pm a,
\]
and it intersects the vertical axis when x = 0, so that
\[
\frac{y^2}{b^2} = 1, \qquad y = \pm b.
\]

FIGURE 26

FIGURE 27
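
The focal description of the ellipse can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `focal_distance_sum` and `ellipse_points`, the sample values a = 5 and c = 3, and the use of the points (a cos t, b sin t) as a convenient supply of solutions of x²/a² + y²/b² = 1 are choices made here, not part of the text.

```python
import math

def focal_distance_sum(x, y, c):
    """Sum of the distances from (x, y) to the fixed points (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)

def ellipse_points(a, c, samples=8):
    """Points satisfying x^2/a^2 + y^2/b^2 = 1, with b = sqrt(a^2 - c^2).

    The parametrization (a cos t, b sin t) is an assumption of this sketch;
    it is just a convenient way to produce points on the curve.
    """
    b = math.sqrt(a * a - c * c)
    return [(a * math.cos(2 * math.pi * k / samples),
             b * math.sin(2 * math.pi * k / samples)) for k in range(samples)]

a, c = 5.0, 3.0  # illustrative values with a > c
sums = [focal_distance_sum(x, y, c) for (x, y) in ellipse_points(a, c)]
# Each entry of `sums` should agree with 2a up to rounding error.
```

Every sampled point gives a distance sum equal to 2a up to floating-point error, which is the direction of the equivalence that the algebra above runs backwards.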


The hyperbola is defined analogously, except that we require the difference 
of the two distances to be constant. Choosing the points (—c, 0) and (c, 0) 
once again, and the constant difference as 2a, we obtain, as the condition that 
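(x, y) be on the hyperbola,

\[
\sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = \pm 2a,
\]
which may be simplified to
\[
\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.
\]
In this case, however, we must clearly choose c > a, so that a² − c² < 0. If
b = √(c² − a²), then (x, y) is on the hyperbola if and only if
\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.
\]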

The picture is shown in Figure 27. It contains two pieces, because the differ- 
ence between the distances of (x, y) from (—c, 0) and (c, 0) may be taken in 
two different orders. The hyperbola intersects the horizontal axis when y = 0, 
so that x = ±a, but it never intersects the vertical axis.

It is interesting to compare (Figure 28) the hyperbola with a = b = √2
and the graph of the function f(x) = 1/x. The drawings look quite similar,
and the two sets are actually identical, except for a rotation through an angle
of 45° (Problem 21).
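
A quick numerical check makes the claim plausible before it is proved. This is a sketch under stated assumptions: the function names `rotate_45` and `hyperbola_lhs`, the sample values of t, and the particular direction of rotation are choices made here for illustration.

```python
import math

def rotate_45(x, y):
    """Coordinates of (x, y) relative to axes turned through 45 degrees.

    The direction of rotation is an assumption of this sketch; the check
    below comes out the same for either direction.
    """
    s = 1 / math.sqrt(2)
    return (s * x + s * y, -s * x + s * y)

def hyperbola_lhs(xp, yp):
    """Left side of x^2/a^2 - y^2/b^2 = 1 for a = b = sqrt(2)."""
    return xp ** 2 / 2 - yp ** 2 / 2

# Points (t, 1/t) on the graph of f(x) = 1/x, on both branches.
values = [hyperbola_lhs(*rotate_45(t, 1 / t)) for t in (0.25, 0.5, 1.0, 2.0, -3.0)]
# Each entry of `values` should be 1 up to rounding error.
```

The underlying identity is (x + y)²/4 − (y − x)²/4 = xy, so a point satisfies xy = 1 exactly when its rotated coordinates satisfy the hyperbola equation.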

Clearly no rotation of the plane will change circles or ellipses into the 
graphs of functions. Nevertheless, the study of these important geometric 
figures can often be reduced to the study of functions. Ellipses, for example, are
made up of the graphs of two functions,
\[
f(x) = b\sqrt{1 - (x^2/a^2)}, \qquad -a \le x \le a
\]
and
\[
g(x) = -b\sqrt{1 - (x^2/a^2)}, \qquad -a \le x \le a.
\]

FIGURE 28


Of course, there are many other pairs of functions with this same property. 
For example, we can take 


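\[
f(x) = \begin{cases} b\sqrt{1 - (x^2/a^2)}, & 0 < x \le a \\ -b\sqrt{1 - (x^2/a^2)}, & -a \le x \le 0 \end{cases}
\]
and
\[
g(x) = \begin{cases} -b\sqrt{1 - (x^2/a^2)}, & 0 < x \le a \\ b\sqrt{1 - (x^2/a^2)}, & -a \le x \le 0. \end{cases}
\]
We could also choose
\[
f(x) = \begin{cases} b\sqrt{1 - (x^2/a^2)}, & x \text{ rational}, \ -a \le x \le a \\ -b\sqrt{1 - (x^2/a^2)}, & x \text{ irrational}, \ -a \le x \le a \end{cases}
\]
and
\[
g(x) = \begin{cases} -b\sqrt{1 - (x^2/a^2)}, & x \text{ rational}, \ -a \le x \le a \\ b\sqrt{1 - (x^2/a^2)}, & x \text{ irrational}, \ -a \le x \le a. \end{cases}
\]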


But all these other pairs necessarily involve unreasonable functions which
jump around. A proof, or even a precise statement of this fact, is too difficult 
at present. Although you have probably already begun to make a distinction 
between those functions with reasonable graphs, and those with unreasonable 
graphs, you may find it very difficult to state a reasonable definition of reason- 
able functions. A mathematical definition of this concept is by no means 
easy, and a great deal of this book may be viewed as successive attempts 
to impose more and more conditions that a “reasonable” function must
satisfy. As we define some of these conditions, we will take time out to ask if we
have really succeeded in isolating the functions which deserve to be called
reasonable. The answer, unfortunately, will always be “no,” or at best, a
qualified “yes.”


PROBLEMS 


1. Indicate on a straight line the set of all x satisfying the following condi- 
tions. Also name each set, using the notation for intervals (in some cases 
you will also need the ∪ sign).

(i) |x − 3| < 1.
(ii) |x − 3| ≤ 1.
(iii) |x − a| < ε.
(iv) |x² − 1| < ½.
(v) 1/(1 + x²) ≥ 1/5.
(vi) 1/(1 + x²) ≤ a (give an answer in terms of a, distinguishing various
cases).
(vii) x² + 1 ≥ 2.
(viii) (x + 1)(x − 1)(x − 2) > 0.


2. Draw the set of all points (x, y) satisfying the following conditions. (In
most cases your picture will be a sizable portion of a plane, not just a
line or curve.)

(i) x > y.
(ii) x + a > y + b.
(iii) y < x².
(iv) y ≤ x².
(v) |x − y| < 1.
(vi) |x + y| < 1.
(vii) x + y is an integer.
(viii) 1/(x + y) is an integer.
(ix) (x − 1)² + (y − 2)² < 1.
(x) x² < y < x⁴.


3. Draw the set of all points (x, y) satisfying the following conditions.

(i) |x| + |y| = 1.
(ii) |x| − |y| = 1.
(iii) |x − 1| = |y − 1|.
(iv) |1 − x| = |y − 1|.
(v) x² + y² = 0.
(vi) xy = 0.
(vii) x² − 2x + y² = 4.
(viii) x² = y².


4. Draw the set of all points (x, y) satisfying the following conditions:

(i) x = y².
(ii) y²/a² − x²/b² = 1.
(iii) x = |y|.
(iv) x = sin y.

Hint: You already know the answers when x and y are interchanged.


5. (a) Show that the straight line through (a, b) with slope m is the graph
of the function f(x) = m(x − a) + b. This formula, known as the
“point-slope form” is far more convenient than the equivalent
expression f(x) = mx + (b − ma); it is immediately clear from the
point-slope form that the slope is m, and that the value of f at
a is b.

(b) For a ≠ c, show that the straight line through (a, b) and (c, d) is the
graph of the function
\[
f(x) = \frac{d - b}{c - a}(x - a) + b.
\]

(c) When are the graphs of f(x) = mx + b and g(x) = m′x + b′ parallel
straight lines?

FIGURE 29

6. (a) For any numbers A, B, and C, with A and B not both 0, show that
the set of all (x, y) satisfying Ax + By + C = 0 is a straight line 
(possibly a vertical one). Hint: First decide when a vertical straight 
line is described. 

(b) Show conversely that every straight line, including vertical ones, 
can be described as the set of all (x, y) satisfying Ax + By + C = 0. 

7. (a) Prove that the graphs of the functions
\[
f(x) = mx + b, \qquad g(x) = nx + c,
\]
are perpendicular if mn = −1, by computing the squares of the
lengths of the sides of the triangle in Figure 29. (Why is this special
case, where the lines intersect at the origin, as good as the general
case?)

(b) Prove that the two straight lines consisting of all (x, y) satisfying
the conditions
\[
Ax + By + C = 0, \qquad A'x + B'y + C' = 0,
\]
are perpendicular if and only if AA′ + BB′ = 0.

8. (a) Prove, using Problem 1-18, that
\[
\sqrt{(x_1 + y_1)^2 + (x_2 + y_2)^2} \le \sqrt{x_1^2 + x_2^2} + \sqrt{y_1^2 + y_2^2}.
\]
(b) Prove that
\[
\sqrt{(x_3 - x_1)^2 + (y_3 - y_1)^2} \le \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} + \sqrt{(x_3 - x_2)^2 + (y_3 - y_2)^2}.
\]
Interpret this inequality geometrically (it is called the “triangle
inequality”). When does strict inequality hold?

9. Sketch the graphs of the following functions, plotting enough points to
get a good idea of the general appearance. (Part of the problem is to
make a reasonable decision how many is “enough”; the queries posed
below are meant to show that a little thought will often be more valuable
than hundreds of individual points.)

(i) f(x) = x + 1/x. (What happens for x near 0, and for large x? Where
does the graph lie in relation to the graph of the identity function?
Why does it suffice to consider only positive x at first?)
(ii) f(x) = x − 1/x.
(iii) f(x) = x² + 1/x².
(iv) f(x) = x² − 1/x².

10. Describe the general features of the graph of f if

(i) f is even.
(ii) f is odd.
(iii) f is nonnegative.
(iv) f(x) = f(x + a) for all x (a function with this property is called
periodic, with period a).

11. Graph the functions f(x) = ᵐ√x for m = 1, 2, 3, 4. (There is an easy
way to do this, using Figure 14. Be sure to remember, however, that
ᵐ√x means the positive mth root of x when m is even; you should also note
that there will be an important difference between the graph when m is
even and when m is odd.)

12. (a) Graph f(x) = |x| and f(x) = x².
(b) Graph f(x) = |sin x| and f(x) = sin² x. (There is an important
difference between the graphs, which we cannot yet even describe
rigorously. See if you can discover what it is; part (a) is meant to
be a clue.)

13. Describe the graph of g in terms of the graph of f if

(i) g(x) = f(x) + c.
(ii) g(x) = f(x + c). (It is easy to make a mistake here.)
(iii) g(x) = cf(x).
(iv) g(x) = f(cx).
(For (iii) and (iv), distinguish the cases c = 0, c > 0, c < 0.)
(v) g(x) = f(1/x).
(vi) g(x) = f(|x|).
(vii) g(x) = |f(x)|.
(viii) g(x) = max(f, 0).
(ix) g(x) = min(f, 0).
(x) g(x) = max(f, 1).

14. Draw the graph of f(x) = ax² + bx + c. Hint: Use the methods of
Problem 1-17.


15. The symbol [x] denotes the largest integer which is ≤ x. Thus [2.1] =
[2] = 2 and [−0.9] = [−1] = −1. Draw the graph of the following
functions (they are all quite interesting, and several will reappear
frequently in other problems).

(i) f(x) = [x].
(ii) f(x) = x − [x].
(iii) f(x) = √(x − [x]).
(iv) f(x) = [x] + √(x − [x]).
(v) f(x) = [1/x].
(vi) f(x) = 1/[1/x].


16. Graph the following functions. 


(i) f(x) = {x}, where {x} is defined to be the distance from x to the 
nearest integer. 

(ii) f(x) = {2x}. 

(iii) f(x) = {x} + ½{2x}.

(iv) f(x) = {4x}.

(v) f(x) = {x} + ½{2x} + ¼{4x}.


Many functions may be described in terms of the decimal expansion of a 
number. Although we will not be in a position to describe infinite decimals 
rigorously until Chapter 22, your intuitive notion of infinite decimals should 
suffice to carry you through the following problem, and others which occur 
before Chapter 22. There is one ambiguity about infinite decimals which 
must be eliminated: Every decimal ending in a string of 9’s is equal to another 
ending in a string of 0’s (e.g., 1.23999 . . . = 1.24000 . . .). We will always 
use the one ending in 9’s.


*17. Describe as best you can the graphs of the following functions (a com- 
plete picture is usually out of the question). 


(i) f(x) = the 1st number in the decimal expansion of x. 

(ii) f(x) = the 2nd number in the decimal expansion of x. 

(iii) f(x) = the number of 7’s in the decimal expansion of x if this 
number is finite, and 0 otherwise.

(iv) f(x) = 0 if the number of 7’s in the decimal expansion of x is 
finite, and 1 otherwise. 

(v) f(x) = the number obtained by replacing all digits in the decimal 
expansion of x which come after the first 7 (if any) by 0. 

(vi) f(x) = 0 if 1 never appears in the decimal expansion of x, and n 
if 1 first appears in the nth place. 


FIGURE 30

FIGURE 31

*18. Let
\[
f(x) = \begin{cases} 0, & x \text{ irrational} \\ \dfrac{1}{q}, & x = \dfrac{p}{q} \text{ rational in lowest terms.} \end{cases}
\]
(A number p/q is in lowest terms if p and q are integers with no common
factor, and q > 0). Draw the graph of f as well as you can (don’t
sprinkle points randomly on the paper; consider first the rational numbers
with q = 2, then those with q = 3, etc.).

19. (a) The points on the graph of f(x) = x² are the ones of the form (x, x²).
Prove that each such point is equidistant from the point (0, ¼) and
the graph of g(x) = −¼. (See Figure 30.)

(b) Given a point P = (α, β) and a horizontal line L, the graph of
g(x) = γ, show that the set of all points (x, y) equidistant from P and
L is the graph of a function of the form f(x) = ax² + bx + c.

*20. (a) Show that the square of the distance from (c, d) to (x, mx) is
\[
x^2(m^2 + 1) + x(-2md - 2c) + d^2 + c^2.
\]
Using Problem 1-17 to find the minimum of these numbers, show
that the distance from (c, d) to the graph of f(x) = mx is
\[
\frac{|cm - d|}{\sqrt{m^2 + 1}}.
\]

(b) Find the distance from (c, d) to the graph of f(x) = mx + b.
(Reduce this case to part (a).)

21. (a) Using Problem 20, show that the numbers x′ and y′ indicated in
Figure 31 are given by
\[
x' = \frac{1}{\sqrt{2}}\,x + \frac{1}{\sqrt{2}}\,y, \qquad
y' = -\frac{1}{\sqrt{2}}\,x + \frac{1}{\sqrt{2}}\,y.
\]

(b) Show that the set of all (x, y) with (x′/√2)² − (y′/√2)² = 1 is the
same as the set of all (x, y) with xy = 1.


CHAPTER 5

LIMITS


The concept of a limit is surely the most important, and probably the most
difficult one in all of calculus. The goal of this chapter is the definition of
limits, but we are, once more, going to begin with a provisional definition;
what we shall define is not the word “limit” but the notion of a function
approaching a limit.


PROVISIONAL DEFINITION

The function f approaches the limit l near a, if we can make f(x) as close as we
like to l by requiring that x be sufficiently close to, but unequal to, a.


Of the six functions graphed in Figure 1, only the first three approach l at a.
Notice that although g(a) is not defined, and h(a) is defined “the wrong way,”
it is still true that g and h approach l near a. This is because we explicitly
ruled out, in our definition, the necessity of ever considering the value of the
function at a; it is only necessary that f(x) should be close to l for x close to a,
but unequal to a. We are simply not interested in the value of f(a), or even in
the question of whether f(a) is defined.

FIGURE 1


One convenient way of picturing the assertion that f approaches / near a is 
provided by a method of drawing functions that was not mentioned in 
Chapter 4. In this method, we draw two straight lines, each representing R, 
and arrows from a point x in one, to f(x) in the other. Figure 2 illustrates such 
a picture for two different functions. 


(a) f(x) = c
(0) f(x) = x? 
FIGURE 2 


FIGURE 3 


FIGURE 4 




Now consider a function f whose drawing looks like Figure 3. Suppose we 
ask that f(x) be close to l, say within the open interval B which has been
drawn in Figure 3. This can be guaranteed if we consider only the numbers x 
in the interval A of Figure 3. (In this diagram we have chosen the largest 
interval which will work; any smaller interval containing a could have been 
chosen instead.) If we choose a smaller interval B’ (Figure 4) we will, usually, 
have to choose a smaller A’, but no matter how small we choose the open 
interval B, there is always supposed to be some open interval A which works. 

A similar pictorial interpretation is possible in terms of the graph of f, but 
in this case the interval B must be drawn on the vertical axis, and the set A 
on the horizontal axis. The fact that f(x) is in B when x is in A means that the 
part of the graph lying over A is contained in the region which is bounded 
by the horizontal lines through the end points of B; compare Figure 5(a), 
where a valid interval A has been chosen, with Figure 5(b), where A is too 
large. 

FIGURE 5

FIGURE 6

FIGURE 8

In order to apply our definition to a particular function, let us consider
f(x) = x sin 1/x (Figure 6). Despite the erratic behavior of this function near
0 it is clear, at least intuitively, that f approaches 0 near 0, and it is
certainly to be hoped that our definition will allow us to reach the same
conclusion. In the case we are considering, both a and l of the definition are 0, so
we must ask if we can get f(x) = x sin 1/x as close to 0 as desired if we require
that x be sufficiently close to 0, but ≠ 0. To be specific, suppose we wish to
get x sin 1/x within 1/10 of 0. This means we want
\[
-\frac{1}{10} < x \sin\frac{1}{x} < \frac{1}{10}
\]
or, more succinctly, |x sin 1/x| < 1/10. Now this is easy. Since
\[
\left|\sin\frac{1}{x}\right| \le 1, \qquad \text{for all } x \ne 0,
\]
we have
\[
\left|x \sin\frac{1}{x}\right| \le |x|, \qquad \text{for all } x \ne 0.
\]
This means that if |x| < 1/10 and x ≠ 0, then |x sin 1/x| < 1/10; in other words,
x sin 1/x is within 1/10 of 0 provided that x is within 1/10 of 0, but ≠ 0. There
is nothing special about the number 1/10; it is just as easy to guarantee that
|f(x) − 0| < 1/100; simply require that |x| < 1/100, but x ≠ 0. In fact, if we
take any positive number ε we can make |f(x) − 0| < ε simply by requiring
that |x| < ε, and x ≠ 0.
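
The two inequalities used above are easy to test on sample values. The following is a minimal sketch, assuming the sample grid and the variable names chosen here; sampling finitely many points illustrates the claim but of course proves nothing.

```python
import math

def f(x):
    """f(x) = x sin(1/x), defined for x != 0."""
    return x * math.sin(1 / x)

eps = 0.1
delta = eps  # the choice made in the text: require |x| < eps and x != 0

# Sample points with 0 < |x| < delta on both sides of 0 (grid is an assumption).
xs = [sign * delta * k / 1001 for k in range(1, 1001) for sign in (1, -1)]

bounded_by_x = all(abs(f(x)) <= abs(x) for x in xs)   # |x sin 1/x| <= |x|
within_eps = all(abs(f(x) - 0) < eps for x in xs)     # |f(x) - 0| < eps
```

Changing `eps` to any other positive number, with `delta = eps`, should leave both checks true, which mirrors the remark that nothing is special about 1/10.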

For the function f(x) = x² sin 1/x (Figure 7) it seems even clearer that f
approaches 0 near 0. If, for example, we want
\[
\left|x^2 \sin\frac{1}{x}\right| < \frac{1}{10},
\]
then we certainly need only require that |x| < 1/10 and x ≠ 0, since this implies
that |x²| < 1/100 and consequently
\[
\left|x^2 \sin\frac{1}{x}\right| \le |x^2| < \frac{1}{100} < \frac{1}{10}.
\]
(We could do even better, and allow |x| < 1/√10 and x ≠ 0, but there is no
particular virtue in being as economical as possible.) In general, if ε > 0, to
ensure that
\[
\left|x^2 \sin\frac{1}{x}\right| < \varepsilon,
\]
we need only require that
\[
|x| < \varepsilon \quad\text{and}\quad x \ne 0,
\]
provided that ε ≤ 1. If we are given an ε which is greater than 1 (it might be,
even though it is “small” ε’s which are of interest), then it does not suffice to
require that |x| < ε, but it certainly suffices to require that |x| < 1 and x ≠ 0.

FIGURE 7

FIGURE 10


As a third example, consider the function f(x) = √|x| sin 1/x (Figure 8).
In order to make |√|x| sin 1/x| < ε we can require that
\[
|x| < \varepsilon^2 \quad\text{and}\quad x \ne 0,
\]
if ε ≤ 1, or that |x| < 1 and x ≠ 0 if ε > 1 (the algebra is left to you).

FIGURE 9

Finally, let us consider the function f(x) = sin 1/x (Figure 9). For this
function it is false that f approaches 0 near 0. This amounts to saying that it is
not true for every number ε > 0 that we can get |f(x) − 0| < ε by choosing
x sufficiently small, and ≠ 0. To show this we simply have to find one ε > 0
for which the condition |f(x) − 0| < ε cannot be guaranteed, no matter how
small we require |x| to be. In fact, ε = ½ will do: it is impossible to ensure that
|f(x)| < ½ no matter how small we require |x| to be; for if A is any interval
containing 0, there is some number x = 1/(90 + 360n) which is in this interval,
and for this x we have f(x) = 1.

This same argument can be used (Figure 10) to show that f does not
approach any number near 0. To show this we must again find, for any
particular number l, some number ε > 0 so that |f(x) − l| < ε is not true, no
matter how small x is required to be. The choice ε = ½ works for any number
l; that is, no matter how small we require |x| to be, we cannot ensure that
|f(x) − l| < ½. The reason is, that for any interval A containing 0 there is
some x₁ = 1/(90 + 360n) in this interval, so that
\[
f(x_1) = 1,
\]
and also some x₂ = 1/(270 + 360m) in this interval, so that
\[
f(x_2) = -1.
\]


But the interval from l − ½ to l + ½ cannot contain both −1 and 1, since its
total length is only 1; so we cannot have
\[
|1 - l| < \tfrac{1}{2} \quad\text{and also}\quad |-1 - l| < \tfrac{1}{2},
\]
no matter what l is.

FIGURE 11

FIGURE 12

FIGURE 13

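
The witnesses x₁ and x₂ in this argument can be produced explicitly. The sketch below assumes, as the numbers 90 and 270 suggest, that the angle 1/x is measured in degrees; the function names `f` and `witnesses` and the choice m = n are assumptions made here for illustration.

```python
import math

def f(x):
    """sin(1/x), with the angle 1/x taken in degrees (an assumption of this sketch)."""
    return math.sin(math.radians(1 / x))

def witnesses(delta):
    """Return x1, x2 with 0 < x2 < x1 < delta, f(x1) close to 1 and f(x2) close to -1."""
    n = 0
    while 1 / (90 + 360 * n) >= delta:
        n += 1                      # make n large enough that 1/(90 + 360n) < delta
    x1 = 1 / (90 + 360 * n)         # 1/x1 is 90 degrees plus full turns
    x2 = 1 / (270 + 360 * n)        # 1/x2 is 270 degrees plus full turns (m = n here)
    return x1, x2

def both_within_half(l, delta):
    """Could |f(x) - l| < 1/2 hold at both witnesses? The argument says never."""
    x1, x2 = witnesses(delta)
    return abs(f(x1) - l) < 0.5 and abs(f(x2) - l) < 0.5
```

For any candidate limit l and any δ > 0, `both_within_half(l, delta)` should come out false, since the values at the two witnesses are about 2 apart while the interval from l − ½ to l + ½ has length 1.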
The phenomenon exhibited by f(x) = sin 1/x near 0 can occur in many
ways. If we consider the function
\[
f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational,} \end{cases}
\]
then, no matter what a is, f does not approach any number l near a. In fact,
we cannot make |f(x) − l| < ¼ no matter how close we bring x to a, because
in any interval around a there are numbers x with f(x) = 0, and also numbers
x with f(x) = 1, so that we would need |0 − l| < ¼ and also |1 − l| < ¼.

An amusing variation on this behavior is presented by the function shown
in Figure 11:
\[
f(x) = \begin{cases} x, & x \text{ rational} \\ 0, & x \text{ irrational.} \end{cases}
\]
The behavior of this function is “opposite” to that of g(x) = sin 1/x; it
approaches 0 at 0, but does not approach any number at a, if a ≠ 0. By now
you should have no difficulty convincing yourself that this is true.

As a contrast to the functions considered so far, which have been quite 
pathological, we will now examine some of the simplest functions. 

If f(x) = c, then f approaches c near a, for every number a. In fact, to ensure
that |f(x) − c| < ε one does not need to restrict x to be near a at all; the
condition is automatically satisfied (Figure 12).

As a slight variation, let f be the function shown in Figure 13:
\[
f(x) = \begin{cases} -1, & x < 0 \\ 1, & x > 0. \end{cases}
\]
If a > 0, then f approaches 1 near a: indeed, to ensure that |f(x) − 1| < ε it
certainly suffices to require that |x − a| < a, since this implies
\[
-a < x - a
\]
or
\[
0 < x,
\]
so that f(x) = 1. Similarly, if b < 0, then f approaches −1 near b: to ensure
that |f(x) − (−1)| < ε it suffices to require that |x − b| < −b. Finally, as
you may easily check, f does not approach any number near 0.

The function f(x) = x is easily dealt with. Clearly f approaches a near a: to
ensure that |f(x) − a| < ε we just have to require that |x − a| < ε.

The function f(x) = x² requires a little more work. To show that f approaches
a² near a, we must decide how to ensure that
\[
|x^2 - a^2| < \varepsilon.
\]
Factoring looks like the most promising procedure: we want
\[
|x - a| \cdot |x + a| < \varepsilon.
\]
Obviously the factor |x + a| is the one that will cause trouble. On the other
hand, there is no need to make |x + a| particularly small; as long as we know
some bound on the values of |x + a| we will be in good shape. For example,
if |x + a| < 1,000,000, then we will just need to require that |x − a| <
ε/1,000,000. Therefore, to begin with, let us require that |x − a| < 1 (any
positive number other than 1 would do just as well); presumably this will
ensure that x is not too large, and consequently that |x + a| is not too large.
As a matter of fact, Problem 1-12 shows that
\[
|x| - |a| \le |x - a| < 1,
\]
so
\[
|x| < 1 + |a|,
\]
and consequently
\[
|x + a| \le |x| + |a| < 2|a| + 1.
\]
Now we need only the additional requirement that |x − a| < ε/(2|a| + 1).
In other words,
\[
\text{if } |x - a| < \min\left(1, \frac{\varepsilon}{2|a| + 1}\right), \text{ then } |x^2 - a^2| < \varepsilon.
\]
Naturally, min(1, ε/(2|a| + 1)) will just be ε/(2|a| + 1) for small ε.
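
The bound just derived can be tried out numerically. This is a sketch under stated assumptions: the names `delta_for_square` and `worst_error`, the sampling grid, and the test values of a and ε are all choices made here, and a finite sample only illustrates the estimate.

```python
def delta_for_square(a, eps):
    """The bound from the text: min(1, eps / (2|a| + 1))."""
    return min(1.0, eps / (2 * abs(a) + 1))

def worst_error(a, eps, samples=2000):
    """Largest |x^2 - a^2| seen over sampled x with 0 < |x - a| < delta."""
    d = delta_for_square(a, eps)
    worst = 0.0
    for k in range(1, samples + 1):
        h = d * k / (samples + 1)          # 0 < h < delta
        for x in (a - h, a + h):
            worst = max(worst, abs(x * x - a * a))
    return worst

# Illustrative cases, including one where min(...) is 1 because eps is large.
cases = [(0.0, 0.5), (3.0, 0.1), (-10.0, 1e-3), (2.0, 50.0)]
results = [(a, eps, worst_error(a, eps)) for a, eps in cases]
```

In each case the worst sampled error stays below ε. The case a = 2, ε = 50 shows why the minimum with 1 is needed: without it the bound on |x + a| would no longer be justified.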

Precisely the same sort of trick will show that if f(x) = x³, then f approaches
a³ near a. In fact,
\[
\text{if } |x - a| < \min\left(1, \frac{\varepsilon}{(1 + |a|)^2 + |a|(1 + |a|) + |a|^2}\right), \text{ then } |x^3 - a^3| < \varepsilon.
\]
The proof of this assertion will show where the weird denominator comes
from: If |x − a| < 1, then |x| < |a| + 1, and consequently
\[
|x^2 + ax + a^2| \le |x|^2 + |a| \cdot |x| + |a|^2 < (1 + |a|)^2 + |a|(1 + |a|) + |a|^2.
\]
Therefore
\[
|x^3 - a^3| = |x - a| \cdot |x^2 + ax + a^2|
< \frac{\varepsilon}{(1 + |a|)^2 + |a|(1 + |a|) + |a|^2} \cdot \left[(1 + |a|)^2 + |a|(1 + |a|) + |a|^2\right]
= \varepsilon.
\]


The time has now come to point out that of the many demonstrations about
limits which we have given, not one has been a real proof. The fault lies not
with our reasoning, but with our definition. If our provisional definition of
a function was open to criticism, our provisional definition of approaching a
limit is even more vulnerable. This definition is simply not sufficiently precise
to be used in proofs. It is hardly clear how one “makes” f(x) close to l
(whatever “close” means) by “requiring” x to be sufficiently close to a (however
close “sufficiently” close is supposed to be). Despite the criticisms of our
definition you may feel (I certainly hope you do) that our arguments were
nevertheless quite convincing. In order to present any sort of argument at all,
we have been practically forced to invent the real definition. It is possible to
arrive at this definition in several steps, each one clarifying some obscure
phrase which still remains. Let us begin, once again, with the provisional
definition:


The function f approaches the limit l near a, if we can make f(x) as close
as we like to l by requiring that x be sufficiently close to, but unequal to, a.


The very first change which we made in this definition was to note that making
f(x) close to l meant making |f(x) − l| small, and similarly for x and a:


The function f approaches the limit l near a, if we can make |f(x) − l|
as small as we like by requiring that |x − a| be sufficiently small, and
x ≠ a.


The second, more crucial, change was to note that making |f(x) − l| “as
small as we like” means making |f(x) − l| < ε for any ε > 0 that happens to
be given us:


The function f approaches the limit l near a, if for every number ε > 0
we can make |f(x) − l| < ε by requiring that |x − a| be sufficiently
small, and x ≠ a.


There is a common pattern to all the demonstrations about limits which we
have given. For each number ε > 0 we found some other positive number,
δ say, with the property that if x ≠ a and |x − a| < δ, then |f(x) − l| < ε.
For the function f(x) = x sin 1/x (with a = 0, l = 0), the number δ was just
the number ε; for f(x) = √|x| sin 1/x, it was ε² if ε ≤ 1, and it was 1 if
ε > 1; for f(x) = x² it was the minimum of 1 and ε/(2|a| + 1). In general, it
may not be at all clear how to find the number δ, given ε, but it is the
condition |x − a| < δ which expresses how small “sufficiently” small must be:


The function f approaches the limit l near a, if for every ε > 0 there is
some δ > 0 such that, for all x, if |x − a| < δ and x ≠ a, then |f(x) − l| < ε.


This is practically the definition we will adopt. We will make only one trivial
change, noting that “|x − a| < δ and x ≠ a” can just as well be expressed
“0 < |x − a| < δ.”


DEFINITION

The function f approaches the limit l near a means: for every ε > 0 there
is some δ > 0 such that, for all x, if 0 < |x − a| < δ, then |f(x) − l| < ε.
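
The quantifier structure of the definition (a δ for each ε, working for all x) can be mirrored in a small experiment. The sketch below is an illustration under stated assumptions: `find_violation` and `delta_for` are hypothetical helper names, the sampling grid is arbitrary, and a search that finds no violation among finitely many sample points is evidence, not a proof.

```python
import math

def find_violation(f, a, l, eps, delta, samples=2000):
    """Return a sampled x with 0 < |x - a| < delta and |f(x) - l| >= eps, or None."""
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)      # 0 < h < delta, so x != a below
        for x in (a - h, a + h):
            if abs(f(x) - l) >= eps:
                return x
    return None

def f(x):
    """f(x) = sqrt(|x|) sin(1/x), one of the examples discussed in the text."""
    return math.sqrt(abs(x)) * math.sin(1 / x)

def delta_for(eps):
    """The delta quoted in the text for this f: eps^2 if eps <= 1, and 1 otherwise."""
    return eps ** 2 if eps <= 1 else 1.0

# With the delta from the text, no sampled x should violate |f(x) - 0| < eps.
violations = [find_violation(f, 0.0, 0.0, eps, delta_for(eps))
              for eps in (2.0, 1.0, 0.1, 0.01)]

# A delta chosen too generously for eps = 0.1 does admit a violating x.
too_generous = find_violation(f, 0.0, 0.0, 0.1, 1.0)
```

The design point is that `delta_for` is a function of ε alone, chosen before any x is examined, which is exactly the order of the quantifiers in the definition.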




This definition is so important (everything we do from now on depends on it)
that proceeding any further without knowing it is hopeless. If necessary
memorize it, like a poem! That, at least, is better than stating it incorrectly;
if you do this you are doomed to give incorrect proofs. A good exercise in
giving correct proofs is to review every fact already demonstrated about
functions approaching limits, giving real proofs of each. This requires writing
down the correct definition of what you are proving, but not much more;
all the algebraic work has been done already. When proving that f does not
approach l at a, be sure to negate the definition correctly:


If it is not true that


for every ε > 0 there is some δ > 0 such that, for all x, if 0 < |x − a| < δ,
then |f(x) − l| < ε,


then


there is some ε > 0 such that for every δ > 0 there is some x which satisfies
0 < |x − a| < δ but not |f(x) − l| < ε.


Thus, to show that the function f(x) = sin 1/x does not approach 0 near 0, we
consider ε = ½ and note that for every δ > 0 there is some x with 0 < |x − 0|
< δ but not |sin 1/x − 0| < ½; namely, an x of the form 1/(90 + 360n),
where n is so large that 1/(90 + 360n) < δ.

As an illustration of the use of the definition of a function approaching a
limit, we have reserved the function shown in Figure 14, a standard example,
but one of the most complicated:
\[
f(x) = \begin{cases} 0, & x \text{ irrational}, \ 0 < x < 1 \\ 1/q, & x = p/q \text{ in lowest terms}, \ 0 < x < 1. \end{cases}
\]
(Recall that p/q is in lowest terms if p and q are integers with no common
factor and q > 0.)

FIGURE 14


For any number a, with 0 < a < 1, the function f approaches 0 at a. To
prove this, consider any number ε > 0. Let n be a natural number so large
that 1/n < ε. Notice that the only numbers x for which |f(x) − 0| < ε could
be false are:
\[
\frac{1}{2}; \ \frac{1}{3}, \frac{2}{3}; \ \frac{1}{4}, \frac{3}{4}; \ \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \frac{4}{5}; \ \ldots; \ \frac{1}{n}, \ldots, \frac{n-1}{n}.
\]
(If a is rational, then a might be one of these numbers.) However many of
these numbers there may be, there are, at any rate, only finitely many.
Therefore, of all these numbers, one is closest to a; that is, |p/q − a| is smallest for
one p/q among these numbers. (If a happens to be one of these numbers, then
consider only the values |p/q − a| for p/q ≠ a.) This closest distance may be
chosen as the δ. For if 0 < |x − a| < δ, then x is not one of
\[
\frac{1}{2}, \ \ldots, \ \frac{n-1}{n}
\]
and therefore |f(x) − 0| < ε is true. This completes the proof. Note that our
description of the δ which works for a given ε is completely adequate; there
is no reason why we must give a formula for δ in terms of ε.
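
Although no formula for δ is needed, the description above is concrete enough to compute with. The following sketch uses exact rational arithmetic and rests on assumptions that should be read as such: it handles only rational a and rational sample points x (at irrational x the function is 0 and nothing needs checking), and the names `delta_for` and `nearby` and the test values are chosen here for illustration.

```python
from fractions import Fraction

def f(x):
    """Value 1/q at a rational x = p/q; Fraction stores p/q in lowest terms."""
    return Fraction(1, x.denominator)

def delta_for(a, eps):
    """Distance from a to the nearest p/q != a in (0, 1) with q <= n, where 1/n < eps."""
    n = 1
    while Fraction(1, n) >= eps:
        n += 1                                  # smallest n with 1/n < eps
    candidates = {Fraction(p, q) for q in range(2, n + 1) for p in range(1, q)}
    candidates.discard(a)                       # ignore a itself, as in the text
    if not candidates:
        return Fraction(1)                      # no exceptional points to avoid
    return min(abs(c - a) for c in candidates)

a, eps = Fraction(1, 3), Fraction(1, 10)
delta = delta_for(a, eps)

# Rational sample points with 0 < |x - a| < delta (the grid is an assumption).
nearby = [a + sign * delta * Fraction(k, 50) for k in range(1, 50) for sign in (1, -1)]
all_small = all(f(x) < eps for x in nearby)
```

Every sampled x has a denominator larger than n, so f(x) < 1/n < ε, which is the finiteness argument of the proof carried out on one example.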

Armed with our definition, we are now prepared to prove our first theorem;
you have probably assumed the result all along, which is a very reasonable
thing to do. This theorem is really a test case for our definition: if the theorem
could not be proved, our definition would be useless.


THEOREM 1

A function cannot approach two different limits near a. In other words, if f
approaches l near a, and f approaches m near a, then l = m.


PROOF

Since this is our first theorem about limits it will certainly be necessary to
translate the hypotheses according to the definition.
Since f approaches l near a, we know that for any ε > 0 there is some
number δ₁ > 0 such that, for all x,
\[
\text{if } 0 < |x - a| < \delta_1, \text{ then } |f(x) - l| < \varepsilon.
\]
We also know, since f approaches m near a, that there is some δ₂ > 0 such
that, for all x,
\[
\text{if } 0 < |x - a| < \delta_2, \text{ then } |f(x) - m| < \varepsilon.
\]
We have had to use two numbers, δ₁ and δ₂, since there is no guarantee that
the δ which works in one definition will work in the other. But, in fact, it is
now easy to conclude that for any ε > 0 there is some δ > 0 such that, for
all x,
\[
\text{if } 0 < |x - a| < \delta, \text{ then } |f(x) - l| < \varepsilon \text{ and } |f(x) - m| < \varepsilon;
\]
we simply choose δ = min(δ₁, δ₂).


To complete the proof we just have to pick a particular ε > 0 for which the
two conditions
\[
|f(x) - l| < \varepsilon \quad\text{and}\quad |f(x) - m| < \varepsilon
\]
cannot both hold, if l ≠ m. The proper choice is suggested by Figure 15. If
l ≠ m, so that |l − m| > 0, we can choose |l − m|/2 as our ε. It follows that
there is a δ > 0 such that, for all x,
\[
\text{if } 0 < |x - a| < \delta, \text{ then } |f(x) - l| < \frac{|l - m|}{2} \text{ and } |f(x) - m| < \frac{|l - m|}{2}.
\]
This implies that for 0 < |x − a| < δ we have
\[
|l - m| = |l - f(x) + f(x) - m| \le |l - f(x)| + |f(x) - m|
< \frac{|l - m|}{2} + \frac{|l - m|}{2} = |l - m|,
\]
a contradiction. ∎

FIGURE 15


The number l which f approaches near a is denoted by lim_{x→a} f(x) (read: the
limit of f(x) as x approaches a). This definition is possible only because of
Theorem 1, which ensures that lim_{x→a} f(x) never has to stand for two different
numbers. The equation
\[
\lim_{x \to a} f(x) = l
\]
has exactly the same meaning as the phrase


f approaches l near a.


The possibility still remains that f does not approach l near a, for any l, so that
lim_{x→a} f(x) = l is false for every number l. This is usually expressed by saying
that “lim_{x→a} f(x) does not exist.”
za 
Notice that our new notation introduces an extra, utterly irrelevant letter
x, which could be replaced by t, y, or any other letter which does not already
appear; the symbols
\[
\lim_{x \to a} f(x), \qquad \lim_{t \to a} f(t), \qquad \lim_{y \to a} f(y),
\]
all denote precisely the same number, which depends on f and a, and has
nothing to do with x, t, or y (these letters, in fact, do not denote anything at
all). A more logical symbol would be something like lim_a f, but this notation,
despite its brevity, is so infuriatingly rigid that almost no one has seriously
tried to use it. The notation lim_{x→a} f(x) is much more useful because a function
f often has no simple name, even though it might be possible to express f(x)
by a simple formula involving x. Thus, the short symbol
\[
\lim_{x \to a} (x^2 + \sin x)
\]
could be paraphrased only by the awkward expression
\[
\lim_{a} f, \quad\text{where } f(x) = x^2 + \sin x.
\]


Another advantage of the standard symbolism is illustrated by the expressions
\[
\lim_{x \to a} x + t^3,
\]
\[
\lim_{t \to a} x + t^3.
\]
The first means the number which f approaches near a when
\[
f(x) = x + t^3, \quad\text{for all } x;
\]
the second means the number which f approaches near a when
\[
f(t) = x + t^3, \quad\text{for all } t.
\]
You should have little difficulty (especially if you consult Theorem 2) proving
that
\[
\lim_{x \to a} x + t^3 = a + t^3,
\]
\[
\lim_{t \to a} x + t^3 = x + a^3.
\]
These examples illustrate the main advantage of our notation, which is its flexibility. In fact, the notation $\lim_{x \to a} f(x)$ is so flexible that there is some danger of forgetting what it really means. Here is a simple exercise in the use of this notation, which will be important later: first interpret precisely, and then prove the equality of the expressions

$$\lim_{x \to a} f(x) \quad \text{and} \quad \lim_{h \to 0} f(a + h).$$


An important part of this chapter is the proof of a theorem which will 
make it easy to find many limits. The proof depends upon certain properties 
of inequalities and absolute values, hardly surprising when one considers the 
definition of limit. Although these facts have already been stated in Problems 
1-19, 1-20, and 1-21, because of their importance they will be presented once 
again, in the form of a lemma (a lemma is an auxiliary theorem, a result that 
justifies its existence only by virtue of its prominent role in the proof of 
another theorem). The lemma says, roughly, that if x is close to $x_0$, and y is
close to $y_0$, then x + y will be close to $x_0 + y_0$, and xy will be close to $x_0 y_0$,
and 1/y will be close to $1/y_0$. This intuitive statement is much easier to remember
than the precise estimates of the lemma, and it is not unreasonable to read
the proof of Theorem 2 first, in order to see just how these estimates are used.


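LEMMA

(1) If
$$|x - x_0| < \frac{\varepsilon}{2} \quad \text{and} \quad |y - y_0| < \frac{\varepsilon}{2},$$
then
$$|(x + y) - (x_0 + y_0)| < \varepsilon.$$

(2) If
$$|x - x_0| < \min\left(1, \frac{\varepsilon}{2(|y_0| + 1)}\right) \quad \text{and} \quad |y - y_0| < \frac{\varepsilon}{2(|x_0| + 1)},$$
then
$$|xy - x_0 y_0| < \varepsilon.$$

(3) If $y_0 \ne 0$ and
$$|y - y_0| < \min\left(\frac{|y_0|}{2}, \frac{\varepsilon |y_0|^2}{2}\right),$$
then $y \ne 0$ and
$$\left|\frac{1}{y} - \frac{1}{y_0}\right| < \varepsilon.$$

PROOF

(1)
$$|(x + y) - (x_0 + y_0)| = |(x - x_0) + (y - y_0)| \le |x - x_0| + |y - y_0| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

(2) Since $|x - x_0| < 1$ we have
$$|x| - |x_0| \le |x - x_0| < 1,$$
so that
$$|x| < 1 + |x_0|.$$
Thus
$$|xy - x_0 y_0| = |x(y - y_0) + y_0(x - x_0)| \le |x| \cdot |y - y_0| + |y_0| \cdot |x - x_0| < (1 + |x_0|) \cdot \frac{\varepsilon}{2(|x_0| + 1)} + |y_0| \cdot \frac{\varepsilon}{2(|y_0| + 1)} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

(3) We have
$$|y_0| - |y| \le |y - y_0| < \frac{|y_0|}{2},$$
so $|y| > |y_0|/2$. In particular, $y \ne 0$, and
$$\frac{1}{|y|} < \frac{2}{|y_0|}.$$
Thus
$$\left|\frac{1}{y} - \frac{1}{y_0}\right| = \frac{|y_0 - y|}{|y| \cdot |y_0|} < \frac{2}{|y_0|} \cdot \frac{1}{|y_0|} \cdot \frac{\varepsilon |y_0|^2}{2} = \varepsilon.$$ ∎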
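The estimates of the lemma can be spot-checked numerically. The following is only a sketch: the function name `lemma_holds`, the random sampling, and the factor 0.999 (which keeps every sampled point strictly inside the allowed radius) are assumptions made for this illustration. Sampling can expose a wrong bound, but it proves nothing.

```python
import random

def lemma_holds(x0, y0, eps, trials=10_000, seed=0):
    """Sample points satisfying the lemma's hypotheses and test its three conclusions.

    Requires y0 != 0, because part (3) of the lemma does.
    """
    rng = random.Random(seed)
    r1 = eps / 2                                    # part (1): radius for both x and y
    rx = min(1.0, eps / (2 * (abs(y0) + 1)))        # part (2): radius for x
    ry = eps / (2 * (abs(x0) + 1))                  # part (2): radius for y
    r3 = min(abs(y0) / 2, eps * abs(y0) ** 2 / 2)   # part (3): radius for y
    for _ in range(trials):
        # Part (1): the sum stays within eps of x0 + y0.
        x = x0 + r1 * rng.uniform(-0.999, 0.999)
        y = y0 + r1 * rng.uniform(-0.999, 0.999)
        if not abs((x + y) - (x0 + y0)) < eps:
            return False
        # Part (2): the product stays within eps of x0 * y0.
        x = x0 + rx * rng.uniform(-0.999, 0.999)
        y = y0 + ry * rng.uniform(-0.999, 0.999)
        if not abs(x * y - x0 * y0) < eps:
            return False
        # Part (3): y is nonzero and 1/y stays within eps of 1/y0.
        y = y0 + r3 * rng.uniform(-0.999, 0.999)
        if y == 0 or not abs(1 / y - 1 / y0) < eps:
            return False
    return True
```

For instance, `lemma_holds(3.0, -2.0, 0.1)` samples around $x_0 = 3$, $y_0 = -2$ with $\varepsilon = 0.1$.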


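THEOREM 2

If $\lim_{x \to a} f(x) = l$ and $\lim_{x \to a} g(x) = m$, then

(1) $\lim_{x \to a} (f + g)(x) = l + m$;
(2) $\lim_{x \to a} (f \cdot g)(x) = l \cdot m$.

Moreover, if $m \ne 0$, then

(3) $\lim_{x \to a} \left(\frac{1}{g}\right)(x) = \frac{1}{m}$.

PROOF

The hypothesis means that for every $\varepsilon > 0$ there are $\delta_1, \delta_2 > 0$ such that, for all x,

if $0 < |x - a| < \delta_1$, then $|f(x) - l| < \varepsilon$,
and if $0 < |x - a| < \delta_2$, then $|g(x) - m| < \varepsilon$.

This means (since, after all, $\varepsilon/2$ is also a positive number) that there are $\delta_1, \delta_2 > 0$ such that, for all x,

if $0 < |x - a| < \delta_1$, then $|f(x) - l| < \frac{\varepsilon}{2}$,
and if $0 < |x - a| < \delta_2$, then $|g(x) - m| < \frac{\varepsilon}{2}$.

Now let $\delta = \min(\delta_1, \delta_2)$. If $0 < |x - a| < \delta$, then $0 < |x - a| < \delta_1$ and $0 < |x - a| < \delta_2$ are both true, so both

$$|f(x) - l| < \frac{\varepsilon}{2} \quad \text{and} \quad |g(x) - m| < \frac{\varepsilon}{2}$$

are true. But by part (1) of the lemma this implies that $|(f + g)(x) - (l + m)| < \varepsilon$. This proves (1).

To prove (2) we proceed similarly, after consulting part (2) of the lemma. If $\varepsilon > 0$ there are $\delta_1, \delta_2 > 0$ such that, for all x,

if $0 < |x - a| < \delta_1$, then $|f(x) - l| < \min\left(1, \frac{\varepsilon}{2(|m| + 1)}\right)$,
and if $0 < |x - a| < \delta_2$, then $|g(x) - m| < \frac{\varepsilon}{2(|l| + 1)}$.

Again let $\delta = \min(\delta_1, \delta_2)$. If $0 < |x - a| < \delta$, then

$$|f(x) - l| < \min\left(1, \frac{\varepsilon}{2(|m| + 1)}\right) \quad \text{and} \quad |g(x) - m| < \frac{\varepsilon}{2(|l| + 1)}.$$

So, by the lemma, $|(f \cdot g)(x) - l \cdot m| < \varepsilon$, and this proves (2).

Finally, if $\varepsilon > 0$ there is a $\delta > 0$ such that, for all x,

if $0 < |x - a| < \delta$, then $|g(x) - m| < \min\left(\frac{|m|}{2}, \frac{\varepsilon |m|^2}{2}\right)$.

But according to part (3) of the lemma this means, first, that $g(x) \ne 0$, so $(1/g)(x)$ makes sense, and second that

$$\left|\left(\frac{1}{g}\right)(x) - \frac{1}{m}\right| < \varepsilon.$$

This proves (3). ∎

Using Theorem 2 we can prove, trivially, such facts as

$$\lim_{x \to a} \frac{x^3 + 7x^5}{x^2 + 1} = \frac{a^3 + 7a^5}{a^2 + 1},$$

without going through the laborious process of finding a $\delta$, given an $\varepsilon$. We must begin with

$$\lim_{x \to a} 7 = 7,$$
$$\lim_{x \to a} 1 = 1,$$
$$\lim_{x \to a} x = a,$$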

but these are easy to prove directly. If we want to find the $\delta$, however, the proof of Theorem 2 amounts to a prescription for doing this. Suppose, to take a simpler example, that we want to find a $\delta$ such that, for all x,

if $0 < |x - a| < \delta$, then $|x^2 + x - (a^2 + a)| < \varepsilon$.

Consulting the proof of Theorem 2(1), we see that we must first find $\delta_1$ and $\delta_2 > 0$ such that, for all x,

if $0 < |x - a| < \delta_1$, then $|x^2 - a^2| < \frac{\varepsilon}{2}$,
and if $0 < |x - a| < \delta_2$, then $|x - a| < \frac{\varepsilon}{2}$.

Since we have already given proofs that $\lim_{x \to a} x^2 = a^2$ and $\lim_{x \to a} x = a$, we know how to do this:

$$\delta_1 = \min\left(1, \frac{\varepsilon/2}{2(|a| + 1)}\right),$$
$$\delta_2 = \frac{\varepsilon}{2}.$$

Thus we can take

$$\delta = \min(\delta_1, \delta_2) = \min\left(\min\left(1, \frac{\varepsilon/2}{2(|a| + 1)}\right), \frac{\varepsilon}{2}\right).$$
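The prescription just described can be turned into a small computation. This is a minimal sketch, assuming the estimate $\delta_1 = \min(1, (\varepsilon/2)/(2(|a| + 1)))$ for $x^2$ and $\delta_2 = \varepsilon/2$ for $x$; the function names `delta_sum` and `check_sum` are hypothetical, and the check only samples finitely many points.

```python
def delta_sum(a, eps):
    """A delta for |x**2 + x - (a**2 + a)| < eps, following the proof of Theorem 2(1)."""
    delta1 = min(1.0, (eps / 2) / (2 * (abs(a) + 1)))  # gives |x**2 - a**2| < eps/2
    delta2 = eps / 2                                    # gives |x - a| < eps/2
    return min(delta1, delta2)

def check_sum(a, eps, samples=1000):
    """Test the inequality at sample points with 0 < |x - a| < delta."""
    d = delta_sum(a, eps)
    for k in range(1, samples + 1):
        for sign in (-1, 1):
            x = a + sign * d * k / (samples + 1)
            if not abs(x * x + x - (a * a + a)) < eps:
                return False
    return True
```

The value returned is far from the largest $\delta$ that works; the proof only promises that it is small enough.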


[FIGURE 16: panels (a) and (b)]

[FIGURE 17]


If $a \ne 0$, the same method can be used to find a $\delta > 0$ such that, for all x,

if $0 < |x - a| < \delta$, then $\left|\frac{1}{x^2} - \frac{1}{a^2}\right| < \varepsilon$.

The proof of Theorem 2(3) shows that the second condition will follow if we find a $\delta > 0$ such that, for all x,

if $0 < |x - a| < \delta$, then $|x^2 - a^2| < \min\left(\frac{|a|^2}{2}, \frac{\varepsilon |a|^4}{2}\right)$.

Thus we can take

$$\delta = \min\left(1, \frac{\min\left(\frac{|a|^2}{2}, \frac{\varepsilon |a|^4}{2}\right)}{2(|a| + 1)}\right).$$

Naturally, these complicated expressions for $\delta$ can be simplified considerably, after they have been derived.

One technical detail in the proof of Theorem 2 deserves some discussion. In order for $\lim_{x \to a} f(x)$ to be defined it is, as we know, not necessary for f to be defined at a, nor is it necessary for f to be defined at all points $x \ne a$. However, there must be some $\delta > 0$ such that f(x) is defined for x satisfying $0 < |x - a| < \delta$; otherwise the clause

"if $0 < |x - a| < \delta$, then $|f(x) - l| < \varepsilon$"

would make no sense at all, since the symbol f(x) would make no sense for some x's. If f and g are two functions for which the definition makes sense, it is easy to see that the same is true for f + g and f · g. But this is not so clear for 1/g, since 1/g is undefined for x with g(x) = 0. However, this fact gets established in the proof of Theorem 2(3).
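The $\delta$ for $1/x^2$ derived above can be computed in the same mechanical way. A minimal sketch, assuming the estimate $|x - a| < \min(1, \eta/(2(|a| + 1)))$ to force $|x^2 - a^2| < \eta$; the names `delta_reciprocal_square` and `check_reciprocal_square` are hypothetical.

```python
def delta_reciprocal_square(a, eps):
    """A delta for |1/x**2 - 1/a**2| < eps when a != 0, following Theorem 2(3)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    # Theorem 2(3) with g(x) = x**2 and m = a**2 asks for |x**2 - a**2| < eta.
    eta = min(a * a / 2, eps * a ** 4 / 2)
    # Estimate for x**2: |x - a| < min(1, eta / (2(|a| + 1))) forces |x**2 - a**2| < eta.
    return min(1.0, eta / (2 * (abs(a) + 1)))

def check_reciprocal_square(a, eps, samples=1000):
    """Test the inequality at sample points with 0 < |x - a| < delta."""
    d = delta_reciprocal_square(a, eps)
    for k in range(1, samples + 1):
        for sign in (-1, 1):
            x = a + sign * d * k / (samples + 1)
            if x == 0 or not abs(1 / (x * x) - 1 / (a * a)) < eps:
                return False
    return True
```

The bound $|x^2 - a^2| < |a|^2/2$ is what keeps $x$ away from 0, so the division is safe inside the checked range.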

There are times when we would like to speak of the limit which f approaches at a, even though there is no $\delta > 0$ such that f(x) is defined for x satisfying $0 < |x - a| < \delta$. For example, we want to distinguish the behavior of the two functions shown in Figure 16, even though they are not defined for numbers less than a. For the function of Figure 16(a) we write

$$\lim_{x \to a^+} f(x) = l \quad \text{or} \quad \lim_{x \downarrow a} f(x) = l.$$

(The symbols on the left are read: the limit of f(x) as x approaches a from above.) These "limits from above" are obviously closely related to ordinary limits, and the definition is very similar: $\lim_{x \to a^+} f(x) = l$ means that for every $\varepsilon > 0$ there is a $\delta > 0$ such that, for all x,

if $0 < x - a < \delta$, then $|f(x) - l| < \varepsilon$.

(The condition "$0 < x - a < \delta$" is equivalent to "$0 < |x - a| < \delta$ and $x > a$.")
"Limits from below" (Figure 17) are defined similarly: $\lim_{x \to a^-} f(x) = l$ (or $\lim_{x \uparrow a} f(x) = l$) means that for every $\varepsilon > 0$ there is a $\delta > 0$ such that, for all x,

if $0 < a - x < \delta$, then $|f(x) - l| < \varepsilon$.


It is quite possible to consider limits from above and below even if f is defined for numbers both greater and less than a. Thus, for the function f of Figure 13, we have

$$\lim_{x \to 0^+} f(x) = 1 \quad \text{and} \quad \lim_{x \to 0^-} f(x) = -1.$$

It is an easy exercise (Problem 27) to show that $\lim_{x \to a} f(x)$ exists if and only if $\lim_{x \to a^+} f(x)$ and $\lim_{x \to a^-} f(x)$ both exist and are equal.
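One-sided behavior is easy to observe numerically. The sketch below uses $f(x) = x/|x|$, which has the one-sided limits 1 and $-1$ at 0 quoted above; that this is the function of Figure 13 is an assumption, and the names `sign_ratio` and `one_sided_values` are hypothetical. Tabulating values suggests a limit but does not prove one.

```python
def sign_ratio(x):
    """f(x) = x/|x|, which is undefined at 0."""
    return x / abs(x)

def one_sided_values(f, a, side, steps=8):
    """Values f(a + side * 10**-k) for k = 1..steps.

    side=+1 approaches a from above, side=-1 from below.
    """
    return [f(a + side * 10.0 ** (-k)) for k in range(1, steps + 1)]
```

Since the values from above are all 1 and the values from below are all $-1$, the two one-sided limits differ, so the two-sided limit at 0 cannot exist.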


Like the definitions of limits from above and below, which have been smuggled into the text informally, there are other modifications of the limit concept which will be found useful. In Chapter 4 it was claimed that if x is large, then sin 1/x is close to 0. This assertion is usually written

$$\lim_{x \to \infty} \sin 1/x = 0.$$

[FIGURE 18]

The symbol $\lim_{x \to \infty} f(x)$ is read "the limit of f(x) as x approaches $\infty$," or "as x becomes infinite," and a limit of the form $\lim_{x \to \infty} f(x)$ is often called a limit at infinity. Figure 18 illustrates a general situation where $\lim_{x \to \infty} f(x) = l$. Formally, $\lim_{x \to \infty} f(x) = l$ means that for every $\varepsilon > 0$ there is a number N such that, for all x,

if $x > N$, then $|f(x) - l| < \varepsilon$.

The analogy with the definition of ordinary limits should be clear: whereas the condition "$0 < |x - a| < \delta$" expresses the fact that x is close to a, the condition "$x > N$" expresses the fact that x is large.
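For the limit at infinity mentioned above, an N can be written down explicitly. The sketch below takes $N = 1/\varepsilon$; it relies on the standard inequality $|\sin t| \le |t|$, which is not proved in this chapter and is used here as an assumption. The function names are hypothetical.

```python
import math

def N_for_sin_reciprocal(eps):
    """An N such that x > N implies |sin(1/x)| < eps.

    If x > 1/eps > 0 then 0 < 1/x < eps, and |sin t| <= |t| gives the bound.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    return 1.0 / eps

def check_sin_reciprocal(eps, multipliers=(1.001, 2.0, 10.0, 1e6)):
    """Test |sin(1/x) - 0| < eps at a few sample points x > N."""
    n = N_for_sin_reciprocal(eps)
    return all(abs(math.sin(1 / (n * m))) < eps for m in multipliers)
```

As with $\delta$ in ordinary limits, any larger N would serve equally well.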

We have spent so little time on limits from above and below, and at infinity, 
because the general philosophy behind the definitions should be clear if you 
understand the definition of ordinary limits (which are by far the most impor- 
tant). Many exercises on these definitions are provided in the Problems, which 
also contain several other types of limits which are occasionally useful. 


PROBLEMS

1. Find the following limits. (These limits all follow, after some algebraic manipulations, from the various parts of Theorem 2; be sure you know which ones are used in each case, but don't bother listing them.)

(i) $\lim_{x \to 1} \frac{x^2 - 1}{x + 1}$.
(ii) $\lim_{x \to 2} \frac{x^3 - 8}{x - 2}$.
(iii) $\lim_{x \to 3} \frac{x^3 - 8}{x - 2}$.
(iv) $\lim_{x \to y} \frac{x^n - y^n}{x - y}$.
(v) $\lim_{y \to x} \frac{x^n - y^n}{x - y}$.
(vi) $\lim_{h \to 0} \frac{\sqrt{a + h} - \sqrt{a}}{h}$.

*2. In each of the following cases, find a $\delta$ such that $|f(x) - l| < \varepsilon$ for all x satisfying $0 < |x - a| < \delta$.

(i) $f(x) = x^4$; $l = a^4$.
(ii) $f(x) = 1/x$; $a = 1$, $l = 1$.
(iii) $f(x) = x^4 + 1/x$; $a = 1$, $l = 2$.
(iv) $f(x) = x/(1 + \sin^2 x)$; $a = 0$, $l = 0$.
(v) $f(x) = \sqrt{|x|}$; $a = 0$, $l = 0$.
(vi) $f(x) = \sqrt{x}$; $a = 1$, $l = 1$.

3. For each of the functions in Problem 4-15, decide for which numbers a the limit $\lim_{x \to a} f(x)$ exists.

4. (a) Do the same for each of the functions in Problem 4-17.
(b) Same problem, if we use infinite decimals ending in a string of 0's instead of those ending in a string of 9's.

5. Suppose the functions f and g have the following property: for all $\varepsilon > 0$ and all x,

if $0 < |x - 2| < \sin^2\left(\frac{\varepsilon^2}{9}\right) + \varepsilon$, then $|f(x) - 2| < \varepsilon$,
if $0 < |x - 2| < \varepsilon^2$, then $|g(x) - 4| < \varepsilon$.

For each $\varepsilon > 0$ find a $\delta > 0$ such that, for all x,

(i) if $0 < |x - 2| < \delta$, then $|f(x) + g(x) - 6| < \varepsilon$.


(ii) if $0 < |x - 2| < \delta$, then $|f(x)g(x) - 8| < \varepsilon$.
(iii) if $0 < |x - 2| < \delta$, then $\left|\frac{1}{g(x)} - \frac{1}{4}\right| < \varepsilon$.
(iv) if $0 < |x - 2| < \delta$, then $\left|\frac{f(x)}{g(x)} - \frac{1}{2}\right| < \varepsilon$.

6. Give an example of a function f for which the following assertion is false: If $|f(x) - l| < \varepsilon$ when $0 < |x - a| < \delta$, then $|f(x) - l| < \varepsilon/2$ when $0 < |x - a| < \delta/2$.

7. (a) If $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ do not exist, can $\lim_{x \to a} [f(x) + g(x)]$ or $\lim_{x \to a} f(x)g(x)$ exist?
(b) If $\lim_{x \to a} f(x)$ exists and $\lim_{x \to a} [f(x) + g(x)]$ exists, must $\lim_{x \to a} g(x)$ exist?
(c) If $\lim_{x \to a} f(x)$ exists and $\lim_{x \to a} g(x)$ does not exist, can $\lim_{x \to a} [f(x) + g(x)]$ exist?
(d) If $\lim_{x \to a} f(x)$ exists and $\lim_{x \to a} f(x)g(x)$ exists, does it follow that $\lim_{x \to a} g(x)$ exists?

8. Prove that $\lim_{x \to a} f(x) = \lim_{h \to 0} f(a + h)$. (This is mainly an exercise in understanding what the terms mean.)

9. (a) Prove that $\lim_{x \to a} f(x) = l$ if and only if $\lim_{x \to a} [f(x) - l] = 0$. (First see why the assertion is obvious; then provide a rigorous proof. In this chapter most problems which ask for proofs should be treated in the same way.)
(b) Prove that $\lim_{x \to 0} f(x) = \lim_{x \to a} f(x - a)$.
(c) Prove that $\lim_{x \to 0} f(x) = \lim_{x \to 0} f(x^3)$.
(d) Give an example where $\lim_{x \to 0} f(x^2)$ exists, but $\lim_{x \to 0} f(x)$ does not.

10. Suppose there is a $\delta > 0$ such that f(x) = g(x) when $0 < |x - a| < \delta$. Prove that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x)$. In other words, $\lim_{x \to a} f(x)$ depends only on the values of f(x) for x near a; this fact is often expressed by saying that limits are a "local property." (It will clearly help to use $\delta'$, or some other letter, instead of $\delta$, in the definition of limits.)

11. (a) Suppose that $f(x) \le g(x)$ for all x. Prove that $\lim_{x \to a} f(x) \le \lim_{x \to a} g(x)$, provided that these limits exist.
(b) How can the hypotheses be weakened?
(c) If $f(x) < g(x)$ for all x, does it necessarily follow that $\lim_{x \to a} f(x) < \lim_{x \to a} g(x)$?

12. Suppose that $f(x) \le g(x) \le h(x)$ and that $\lim_{x \to a} f(x) = \lim_{x \to a} h(x)$. Prove that $\lim_{x \to a} g(x)$ exists, and that $\lim_{x \to a} g(x) = \lim_{x \to a} f(x) = \lim_{x \to a} h(x)$. (Draw a picture!)


*13. (a) Prove that if $\lim_{x \to 0} f(x)/x = l$ and $b \ne 0$, then $\lim_{x \to 0} f(bx)/x = bl$. Hint: Write $f(bx)/x = b[f(bx)/bx]$.
(b) What happens if b = 0?
(c) Part (a) enables us to find $\lim_{x \to 0} (\sin 2x)/x$ in terms of $\lim_{x \to 0} (\sin x)/x$. Find this limit in another way.

14. (a) Prove that if $\lim_{x \to a} f(x) = l$, then $\lim_{x \to a} |f|(x) = |l|$.
(b) Prove that if $\lim_{x \to a} f(x) = l$ and $\lim_{x \to a} g(x) = m$, then $\lim_{x \to a} \max(f, g)(x) = \max(l, m)$ and similarly for min.

15. (a) Prove that $\lim_{x \to 0} 1/x$ does not exist, i.e., show that $\lim_{x \to 0} 1/x = l$ is false for every number l.
(b) Prove that $\lim_{x \to 1} 1/(x - 1)$ does not exist.

16. Prove that if $\lim_{x \to a} f(x) = l$, then there is a number $\delta > 0$ and a number M such that $|f(x)| < M$ if $0 < |x - a| < \delta$. (What does this mean pictorially?) Hint: Why does it suffice to prove that $l - 1 < f(x) < l + 1$ for $0 < |x - a| < \delta$?

17. Prove that if f(x) = 0 for irrational x and f(x) = 1 for rational x, then $\lim_{x \to a} f(x)$ does not exist for any a.

*18. Prove that if f(x) = x for rational x, and f(x) = −x for irrational x, then $\lim_{x \to a} f(x)$ does not exist if $a \ne 0$.

19. (a) Prove that if $\lim_{x \to 0} g(x) = 0$, then $\lim_{x \to 0} g(x) \sin 1/x = 0$.
(b) Generalize this fact as follows: If $\lim_{x \to 0} g(x) = 0$ and $|h(x)| \le M$ for all x, then $\lim_{x \to 0} g(x)h(x) = 0$. (Naturally it is unnecessary to do part (a) if you succeed in doing part (b); actually the statement of part (b) may make it easier than (a). That's one of the values of generalization.)

20. Consider a function f with the following property: if g is any function for which $\lim_{x \to 0} g(x)$ does not exist, then $\lim_{x \to 0} [f(x) + g(x)]$ also does not exist. Prove that this happens if and only if $\lim_{x \to 0} f(x)$ does exist. Hint: This is actually very easy: the assumption that $\lim_{x \to 0} f(x)$ does not exist leads to an immediate contradiction if you consider the right g.

*21. This problem is the analogue of Problem 20 when f + g is replaced by f · g. In this case the situation is considerably more complex, and the analysis requires several steps (those in search of an especially challenging problem can attempt an independent solution).

(a) Suppose that $\lim_{x \to 0} f(x)$ exists and is $\ne 0$. Prove that if $\lim_{x \to 0} g(x)$ does not exist, then $\lim_{x \to 0} f(x)g(x)$ also does not exist.


(b) Prove the same result if $\lim_{x \to 0} |f(x)| = \infty$. (The precise definition of this sort of limit is given in Problem 33.)
(c) Prove that if neither of these two conditions holds, then there is a function g such that $\lim_{x \to 0} g(x)$ does not exist, but $\lim_{x \to 0} f(x)g(x)$ does exist. Hint: Consider separately the following two cases: (1) For some $\varepsilon > 0$ we have $|f(x)| > \varepsilon$ for all sufficiently small x. (2) For every $\varepsilon > 0$, there are arbitrarily small x with $|f(x)| < \varepsilon$. In the second case, begin by choosing points $x_n$ with $|x_n| < 1/n$ and $|f(x_n)| < 1/n$.

*22. Suppose that $A_n$ is, for each natural number n, some finite set of numbers in [0, 1], and that $A_n$ and $A_m$ have no members in common if $m \ne n$. Define f as follows:

$$f(x) = \begin{cases} 1/n, & x \text{ in } A_n \\ 0, & x \text{ not in } A_n \text{ for any } n. \end{cases}$$

Prove that $\lim_{x \to a} f(x) = 0$ for all a in [0, 1].

23. Explain why the following definitions of $\lim_{x \to a} f(x) = l$ are all correct:
For every $\delta > 0$ there is an $\varepsilon > 0$ such that, for all x,

(i) if $0 < |x - a| < \varepsilon$, then $|f(x) - l| < \delta$.
(ii) if $0 < |x - a| < \varepsilon$, then $|f(x) - l| \le \delta$.
(iii) if $0 < |x - a| < \varepsilon$, then $|f(x) - l| < 5\delta$.
(iv) if $0 < |x - a| < \varepsilon/10$, then $|f(x) - l| < \delta$.

24. Give examples to show that the following definitions of $\lim_{x \to a} f(x) = l$ are not correct.

(a) For all $\delta > 0$ there is an $\varepsilon > 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - l| < \varepsilon$.
(b) For all $\varepsilon > 0$ there is a $\delta > 0$ such that if $|f(x) - l| < \varepsilon$, then $0 < |x - a| < \delta$.

25. For each of the functions in Problem 4-15 indicate for which numbers a the one-sided limits $\lim_{x \to a^+} f(x)$ and $\lim_{x \to a^-} f(x)$ exist.

*26. (a) Do the same for each of the functions in Problem 4-17.
(b) Also consider what happens if decimals ending in 0's are used instead of decimals ending in 9's.

27. Prove that $\lim_{x \to a} f(x)$ exists if $\lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x)$.

28. Prove that
(i) $\lim_{x \to 0^+} f(x) = \lim_{x \to 0^-} f(-x)$.
(ii) $\lim_{x \to 0} f(|x|) = \lim_{x \to 0^+} f(x)$.


(iii) $\lim_{x \to 0} f(x^2) = \lim_{x \to 0^+} f(x)$.

(These equations, and others like them, are open to several interpretations. They might mean only that the two limits are equal if they both exist; or that if a certain one of the limits exists, the other also exists and is equal to it; or that if either limit exists, then the other exists and is equal to it. Decide for yourself which interpretations are suitable.)

*29. Suppose that $\lim_{x \to a^-} f(x) < \lim_{x \to a^+} f(x)$. (Draw a picture to illustrate this assertion.) Prove that there is some $\delta > 0$ such that f(x) < f(y) whenever x < a < y and $|x - a| < \delta$ and $|y - a| < \delta$. Is the converse true?

*30. Prove that $\lim_{x \to \infty} (a_n x^n + \cdots + a_0)/(b_m x^m + \cdots + b_0)$ exists if and only if $m \ge n$. What is the limit when m = n? When m > n? Hint: The one easy limit is $\lim_{x \to \infty} 1/x^k = 0$; do some algebra so that this is the only information you need.

31. Prove that $\lim_{x \to 0^+} f(1/x) = \lim_{x \to \infty} f(x)$.

32. Define "$\lim_{x \to -\infty} f(x) = l$."

(a) Find $\lim_{x \to -\infty} (a_n x^n + \cdots + a_0)/(b_m x^m + \cdots + b_0)$.
(b) Prove that $\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(-x)$.
(c) Prove that $\lim_{x \to 0^-} f(1/x) = \lim_{x \to -\infty} f(x)$.

33. We define $\lim_{x \to a} f(x) = \infty$ to mean that for all N there is a $\delta > 0$ such that, for all x, if $0 < |x - a| < \delta$, then f(x) > N. (Draw an appropriate picture!)

(a) Show that $\lim_{x \to 3} 1/(x - 3)^2 = \infty$.
(b) Prove that if $f(x) > \varepsilon > 0$ for all x, and $\lim_{x \to a} g(x) = 0$, then $\lim_{x \to a} f(x)/|g(x)| = \infty$.

34. (a) Define $\lim_{x \to a^+} f(x) = \infty$, $\lim_{x \to a^-} f(x) = \infty$, and $\lim_{x \to \infty} f(x) = \infty$. (Or at least convince yourself that you could write down the definitions if you had the energy. How many other such symbols can you define?)
(b) Prove that $\lim_{x \to 0^+} 1/x = \infty$.
(c) Prove that $\lim_{x \to 0^+} f(x) = \infty$ if and only if $\lim_{x \to \infty} f(1/x) = \infty$.


CHAPTER 6 CONTINUOUS FUNCTIONS

If f is an arbitrary function, it is not necessarily true that

$$\lim_{x \to a} f(x) = f(a).$$

In fact, there are many ways this can fail to be true. For example, f might not even be defined at a, in which case the equation makes no sense (Figure 1). Again, $\lim_{x \to a} f(x)$ might not exist (Figure 2). Finally, as illustrated in Figure 3, even if f is defined at a and $\lim_{x \to a} f(x)$ exists, the limit might not equal f(a).

[FIGURE 1]

[FIGURE 2: panels (a), (b), (c)]

We would like to regard all behavior of this type as abnormal and honor, with some complimentary designation, functions which do not exhibit such peculiarities. The term which has been adopted is "continuous." Intuitively, a function f is continuous if the graph contains no breaks, jumps, or wild oscillations. Although this description will usually enable you to decide whether a function is continuous simply by looking at its graph (a skill well worth cultivating) it is easy to be fooled, and the precise definition is very important.


DEFINITION The function f is continuous at a if

$$\lim_{x \to a} f(x) = f(a).$$

We will have no difficulty finding many examples of functions which are, or are not, continuous at some number a: every example involving limits provides an example about continuity, and Chapter 5 certainly provides enough of these.

[FIGURE 3]

The function f(x) = sin 1/x is not continuous at 0, because it is not even defined at 0, and the same is true of the function g(x) = x sin 1/x. On the other hand, if we are willing to extend the second of these functions, that is, if we wish to define a new function G by

$$G(x) = \begin{cases} x \sin 1/x, & x \ne 0 \\ a, & x = 0, \end{cases}$$

then the choice of a = G(0) can be made in such a way that G will be continuous at 0; to do this we can (in fact, we must) define G(0) = 0 (Figure 4). This sort of extension is not possible for f; if we define

$$F(x) = \begin{cases} \sin 1/x, & x \ne 0 \\ a, & x = 0, \end{cases}$$

then F will not be continuous at 0, no matter what a is, because $\lim_{x \to 0} f(x)$ does not exist.

[FIGURE 4: panels (a) and (b)]

The function

$$f(x) = \begin{cases} x, & x \text{ rational} \\ 0, & x \text{ irrational} \end{cases}$$

is not continuous at a, if $a \ne 0$, since $\lim_{x \to a} f(x)$ does not exist. However, $\lim_{x \to 0} f(x) = 0 = f(0)$, so f is continuous at precisely one point, 0.

The functions f(x) = c, g(x) = x, and $h(x) = x^2$ are continuous at all numbers a, since

$$\lim_{x \to a} f(x) = \lim_{x \to a} c = c = f(a),$$
$$\lim_{x \to a} g(x) = \lim_{x \to a} x = a = g(a),$$
$$\lim_{x \to a} h(x) = \lim_{x \to a} x^2 = a^2 = h(a).$$

Finally, consider the function

$$f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1/q, & x = p/q \text{ in lowest terms.} \end{cases}$$

In Chapter 5 we showed that $\lim_{x \to a} f(x) = 0$ for all a. Since 0 = f(a) only when a is irrational, this function is continuous at a if a is irrational, but not if a is rational.

It is even easier to give examples of continuity if we prove two simple 
theorems. 


THEOREM 1 If f and g are continuous at a, then 


(1) f + g is continuous at a, 
(2) f · g is continuous at a.


Moreover, if $g(a) \ne 0$, then


(3) 1/g is continuous at a. 


PROOF

Since f and g are continuous at a,

$$\lim_{x \to a} f(x) = f(a) \quad \text{and} \quad \lim_{x \to a} g(x) = g(a).$$

By Theorem 2(1) of Chapter 5 this implies that

$$\lim_{x \to a} (f + g)(x) = f(a) + g(a) = (f + g)(a),$$

which is just the assertion that f + g is continuous at a. The proofs of parts (2) and (3) are left to you. ∎

Starting with the functions f(x) = c and f(x) = x, which are continuous at a, for every a, we can use Theorem 1 to conclude that a function

$$f(x) = \frac{b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0}{c_m x^m + c_{m-1} x^{m-1} + \cdots + c_0}$$

is continuous at every point in its domain. But it is harder to get much further than that. When we discuss the sine function in detail it will be easy to prove that sin is continuous at a for all a; let us assume this fact meanwhile. A function like

$$f(x) = \frac{\sin^2 x + x^2 + x^4 \sin x}{\sin^{27} x + 4x^2 \sin^2 x}$$

can now be proved continuous at every point in its domain. But we are still unable to prove the continuity of a function like $f(x) = \sin(x^2)$; we obviously need a theorem about the composition of continuous functions. Before stating this theorem, the following point about the definition of continuity is worth noting. If we translate the equation $\lim_{x \to a} f(x) = f(a)$ according to the definition of limits, we obtain

for every $\varepsilon > 0$ there is $\delta > 0$ such that, for all x,
if $0 < |x - a| < \delta$, then $|f(x) - f(a)| < \varepsilon$.

But in this case, where the limit is f(a), the phrase

$0 < |x - a| < \delta$

may be changed to the simpler condition

$|x - a| < \delta$,

since if x = a it is certainly true that $|f(x) - f(a)| < \varepsilon$.

THEOREM 2

If g is continuous at a, and f is continuous at g(a), then $f \circ g$ is continuous at a. (Notice that f is required to be continuous at g(a), not at a.)

PROOF

Let $\varepsilon > 0$. We wish to find a $\delta > 0$ such that for all x,

if $|x - a| < \delta$, then $|(f \circ g)(x) - (f \circ g)(a)| < \varepsilon$,
i.e., $|f(g(x)) - f(g(a))| < \varepsilon$.

We first use continuity of f to estimate how close g(x) must be to g(a) in order for this inequality to hold. Since f is continuous at g(a), there is a $\delta' > 0$ such that for all y,

(1) if $|y - g(a)| < \delta'$, then $|f(y) - f(g(a))| < \varepsilon$.

In particular, this means that

(2) if $|g(x) - g(a)| < \delta'$, then $|f(g(x)) - f(g(a))| < \varepsilon$.

We now use continuity of g to estimate how close x must be to a in order for the inequality $|g(x) - g(a)| < \delta'$ to hold. The number $\delta'$ is a positive number just like any other positive number; we can therefore take $\delta'$ as the $\varepsilon$ (!) in the definition of continuity of g at a. We conclude that there is a $\delta > 0$ such that, for all x,

(3) if $|x - a| < \delta$, then $|g(x) - g(a)| < \delta'$.

Combining (2) and (3) we see that for all x,

if $|x - a| < \delta$, then $|f(g(x)) - f(g(a))| < \varepsilon$. ∎
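The proof of Theorem 2 is really a recipe: feed the $\delta'$ obtained for f into the role of $\varepsilon$ for g. A minimal sketch follows. The names `compose_delta`, `delta_f`, and `delta_g`, and the particular functions $g(x) = 2x + 1$ at $a = 1$ and $f(y) = y^2$ at $g(a) = 3$, are assumptions chosen only for illustration.

```python
def compose_delta(delta_f, delta_g):
    """Given delta_f(eps) for f at g(a) and delta_g(eps) for g at a,
    return a function giving a delta for f∘g at a."""
    def delta_fg(eps):
        delta_prime = delta_f(eps)    # step (1): how close y must be to g(a)
        return delta_g(delta_prime)   # step (3): delta_prime plays the role of eps for g
    return delta_fg

# Illustration: g(x) = 2x + 1 at a = 1, so g(a) = 3, and f(y) = y**2 at 3.
def delta_g(eps):
    return eps / 2                    # |x - 1| < eps/2 gives |g(x) - 3| < eps

def delta_f(eps):
    return min(1.0, eps / 7)          # |y - 3| < 1 gives |y + 3| < 7

def check_composition(eps, samples=1000):
    """Test |f(g(x)) - f(g(1))| < eps at sample points with |x - 1| < delta."""
    d = compose_delta(delta_f, delta_g)(eps)
    for k in range(-samples, samples + 1):
        x = 1 + d * k / (samples + 1)
        if not abs((2 * x + 1) ** 2 - 9) < eps:
            return False
    return True
```

Note that the sample points include x = a itself, in keeping with the simpler condition $|x - a| < \delta$ used for continuity.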


We can now reconsider the function

$$f(x) = \begin{cases} x \sin 1/x, & x \ne 0 \\ 0, & x = 0. \end{cases}$$

We have already noted that f is continuous at 0. A few applications of Theorems 1 and 2, together with the continuity of sin, show that f is also continuous at a, for $a \ne 0$. Functions like $f(x) = \sin(x^2 + \sin(x + \sin^2(x^3)))$ should be equally easy for you to analyze.

The few theorems of this chapter have all been related to continuity of functions at a single point, but the concept of continuity doesn't begin to be really interesting until we focus our attention on functions which are continuous at all points of some interval. If f is continuous at x for all x in (a, b), then f is called continuous on (a, b). Continuity on a closed interval must be defined a little differently; a function f is called continuous on [a, b] if

(1) f is continuous at x for all x in (a, b),
(2) $\lim_{x \to a^+} f(x) = f(a)$ and $\lim_{x \to b^-} f(x) = f(b)$.


Functions which are continuous on an interval are usually regarded as especially well behaved; indeed continuity might be specified as the first condition which a "reasonable" function ought to satisfy. A continuous function is sometimes described, intuitively, as one whose graph can be drawn without lifting your pencil from the paper. Consideration of the function

$$f(x) = \begin{cases} x \sin 1/x, & x \ne 0 \\ 0, & x = 0 \end{cases}$$

shows that this description is a little too optimistic, but it is nevertheless true that there are many important results involving functions which are continuous on an interval. These theorems are generally much harder than the ones in this chapter, but there is a simple theorem which forms a bridge between the two kinds of results. The hypothesis of this theorem requires continuity at only a single point, but the conclusion describes the behavior of the function on some interval containing the point. Although this theorem is really a lemma for later arguments, it is included here as a preview of things to come.

THEOREM 3

Suppose f is continuous at a, and f(a) > 0. Then there is a number $\delta > 0$ such that f(x) > 0 for all x satisfying $|x - a| < \delta$. Similarly, if f(a) < 0, then there is a number $\delta > 0$ such that f(x) < 0 for all x satisfying $|x - a| < \delta$.

PROOF

Consider the case f(a) > 0. Since f is continuous at a, if $\varepsilon > 0$ there is a $\delta > 0$ such that, for all x,

if $|x - a| < \delta$, then $|f(x) - f(a)| < \varepsilon$.

Since f(a) > 0 we can take f(a) as the $\varepsilon$. Thus there is $\delta > 0$ so that for all x,

if $|x - a| < \delta$, then $|f(x) - f(a)| < f(a)$,

and this last inequality implies f(x) > 0.

A similar proof can be given in the case f(a) < 0; take $\varepsilon = -f(a)$. Or one can apply the first case to the function −f. ∎
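The choice $\varepsilon = f(a)$ in this proof can be tried on a concrete function. The sketch below assumes $f(x) = 1 - x^2$ and uses the estimate $|x - a| < \min(1, \varepsilon/(2(|a| + 1)))$ to force $|x^2 - a^2| < \varepsilon$; the function and the names `delta_for_f` and `positivity_radius` are hypothetical choices, not taken from the text.

```python
def f(x):
    """Example function for the sketch: f(x) = 1 - x**2."""
    return 1 - x * x

def delta_for_f(eps, a):
    """A delta with |f(x) - f(a)| = |x**2 - a**2| < eps whenever |x - a| < delta."""
    return min(1.0, eps / (2 * (abs(a) + 1)))

def positivity_radius(a):
    """Theorem 3: take eps = f(a) > 0; the matching delta keeps f(x) > 0."""
    if f(a) <= 0:
        raise ValueError("this sketch covers only the case f(a) > 0")
    return delta_for_f(f(a), a)
```

At $a = 0.5$ we have $f(a) = 0.75$, and the sketch returns $\delta = 0.25$; on the interval from 0.25 to 0.75 the function is indeed positive. The radius is not the largest possible, since f stays positive on all of (−1, 1).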


PROBLEMS

1. For which of the following functions f is there a continuous function F with domain R such that F(x) = f(x) for all x in the domain of f?

(i) $f(x) = \frac{x^2 - 4}{x - 2}$.
(ii) $f(x) = \frac{|x|}{x}$.
(iii) f(x) = 0, x irrational.
(iv) f(x) = 1/q, x = p/q rational in lowest terms.

2. At which points are the functions of Problems 4-15 and 4-17 continuous?

3. (a) Suppose that f is a function satisfying $|f(x)| \le |x|$ for all x. Show that f is continuous at 0. (Notice that f(0) must equal 0.)
(b) Give an example of such a function f which is not continuous at any $a \ne 0$.
(c) Suppose that g is continuous at 0 and g(0) = 0, and $|f(x)| \le |g(x)|$. Prove that f is continuous at 0.

4. Give an example of a function f such that f is continuous nowhere, but |f| is continuous everywhere.


5. For each number a, find a function which is continuous at a, but not at any other points.

6. (a) Find a function f which is discontinuous at 1, 1/2, 1/3, 1/4, . . . but continuous at all other points.
(b) Find a function f which is discontinuous at 1, 1/2, 1/3, 1/4, . . . , and at 0, but continuous at all other points.

7. Suppose that f satisfies f(x + y) = f(x) + f(y), and that f is continuous at 0. Prove that f is continuous at a for all a.

8. Suppose that f is continuous at a and f(a) = 0. Prove that if $\alpha \ne 0$, then $f + \alpha$ is nonzero in some open interval containing a.

9. (a) Suppose f is not continuous at a. Prove that for some number $\varepsilon > 0$ there are numbers x arbitrarily close to a with $|f(x) - f(a)| > \varepsilon$. Illustrate graphically.
(b) Conclude that for some number $\varepsilon > 0$ either there are numbers x arbitrarily close to a with $f(x) < f(a) - \varepsilon$ or there are numbers x arbitrarily close to a with $f(x) > f(a) + \varepsilon$.

10. (a) Prove that if f is continuous at a, then so is |f|.
(b) Prove that every continuous f can be written f = E + O, where E is even and continuous and O is odd and continuous.
(c) Prove that if f and g are continuous, then so are max(f, g) and min(f, g).
(d) Prove that every continuous f can be written f = g − h, where g and h are nonnegative and continuous.

11. Prove Theorem 1(3) by using Theorem 2 and continuity of the function f(x) = 1/x.

*12. (a) Prove that if f is continuous at l and $\lim_{x \to a} g(x) = l$, then $\lim_{x \to a} f(g(x)) = f(l)$. (You can go right back to the definitions, but it is easier to consider the function G with G(x) = g(x) for $x \ne a$, and G(a) = l.)
(b) Show that if continuity of f at l is not assumed, then it is not generally true that $\lim_{x \to a} f(g(x)) = f(\lim_{x \to a} g(x))$. Hint: Try f(x) = 0 for $x \ne l$, and f(l) = 1.

13. (a) Prove that if f is continuous on [a, b], then there is a function g which is continuous on R, and which satisfies g(x) = f(x) for all x in [a, b]. Hint: Since you obviously have a great deal of choice, try making g constant on $(-\infty, a]$ and $[b, \infty)$.
(b) Give an example to show that this assertion is false if [a, b] is replaced by (a, b).

14. (a) Suppose that g and h are continuous at a, and that g(a) = h(a). Define f(x) to be g(x) if $x \ge a$ and h(x) if $x \le a$. Prove that f is continuous at a.
(b) Suppose g is continuous on [a, b] and h is continuous on [b, c] and g(b) = h(b). Let f(x) be g(x) for x in [a, b] and h(x) for x in [b, c]. Show that f is continuous on [a, c]. (Thus, continuous functions can be "pasted together.")


15. (a) Prove the following version of Theorem 3 for "right-hand continuity": Suppose that $\lim_{x \to a^+} f(x) = f(a)$, and f(a) > 0. Then there is a number $\delta > 0$ such that f(x) > 0 for all x satisfying $0 \le x - a < \delta$. Similarly, if f(a) < 0, then there is a number $\delta > 0$ such that f(x) < 0 for all x satisfying $0 \le x - a < \delta$.
(b) Prove a version of Theorem 3 when $\lim_{x \to b^-} f(x) = f(b)$.

16. If $\lim_{x \to a} f(x)$ exists, but is $\ne f(a)$, then f is said to have a removable discontinuity at a.

(a) If f(x) = sin 1/x for $x \ne 0$ and f(0) = 1, does f have a removable discontinuity at 0? What if f(x) = x sin 1/x for $x \ne 0$, and f(0) = 1?
(b) Suppose f has a removable discontinuity at a. Let g(x) = f(x) for $x \ne a$, and let $g(a) = \lim_{x \to a} f(x)$. Prove that g is continuous at a. (Don't work very hard; this is quite easy.)
(c) Let f(x) = 0 if x is irrational, and let f(p/q) = 1/q if p/q is in lowest terms. What is the function g defined by $g(x) = \lim_{y \to x} f(y)$?
*(d) Let f be a function with the property that every point of discontinuity is a removable discontinuity. This means that $\lim_{y \to x} f(y)$ exists for all x, but f may be discontinuous at some (even infinitely many) numbers x. Define $g(x) = \lim_{y \to x} f(y)$. Prove that g is continuous. (This is not quite so easy as part (b).)
**(e) Is there a function f which is discontinuous at every point, and which has only removable discontinuities? (It is worth thinking about this problem now, but mainly as a test of intuition; even if you suspect the correct answer, you will almost certainly be unable to prove it at the present time. See Problem 21-24.)


CHAPTER 7 THREE HARD THEOREMS

This chapter is devoted to three theorems about continuous functions, and some of their consequences. The proofs of the three theorems themselves will not be given until the next chapter, for reasons which are explained at the end of this chapter.

THEOREM 1

If f is continuous on [a, b] and f(a) < 0 < f(b), then there is some x in [a, b] such that f(x) = 0.

(Geometrically, this means that the graph of a continuous function which starts below the horizontal axis and ends above it must cross this axis at some point, as in Figure 1.)

[FIGURE 1]

THEOREM 2

If f is continuous on [a, b], then f is bounded above on [a, b], that is, there is some number N such that $f(x) \le N$ for all x in [a, b].

(Geometrically, this theorem means that the graph of f lies below some line parallel to the horizontal axis, as in Figure 2.)

THEOREM 3

If f is continuous on [a, b], then there is some number y in [a, b] such that $f(y) \ge f(x)$ for all x in [a, b] (Figure 3).


FIGURE 2 FIGURE 3


These three theorems differ markedly from the theorems of Chapter 6. The
hypotheses of those theorems always involved continuity at a single point,
while the hypotheses of the present theorems require continuity on a whole
interval [a, b]; if continuity fails to hold at a single point, the conclusions
may fail. For example, let f be the function shown in Figure 4,

$$ f(x) = \begin{cases} -1, & 0 \le x < \sqrt{2} \\ 1, & \sqrt{2} \le x \le 2. \end{cases} $$

FIGURE 4

Then f is continuous at every point of [0, 2] except √2, and f(0) < 0 < f(2),
but there is no point x in [0, 2] such that f(x) = 0; the discontinuity at the
single point √2 is sufficient to destroy the conclusion of Theorem 1.

Similarly, suppose that f is the function shown in Figure 5,

$$ f(x) = \begin{cases} 1/x, & x \ne 0 \\ 0, & x = 0. \end{cases} $$

FIGURE 5

Then f is continuous at every point of [0, 1] except 0, but f is not bounded
above on [0, 1]. In fact for any number N > 0, we have f(1/2N) = 2N > N.
This example also shows that the closed interval [a, b] in Theorem 2 cannot
be replaced by the open interval (a, b), for the function f is continuous on (0, 1),
but is not bounded there.

Finally, consider the function shown in Figure 6,

$$ f(x) = \begin{cases} x^2, & x < 1 \\ 0, & x \ge 1. \end{cases} $$

FIGURE 6

On the interval [0, 1] the function f is bounded above, so f does satisfy the
conclusion of Theorem 2, even though f is not continuous on [0, 1]. But f does
not satisfy the conclusion of Theorem 3: there is no y in [0, 1] such that
f(y) ≥ f(x) for all x in [0, 1]; in fact, it is certainly not true that f(1) ≥ f(x) for
all x in [0, 1], so we cannot choose y = 1, nor can we choose 0 ≤ y < 1 because
f(y) < f(x) if x is any number with y < x < 1.

This example shows that Theorem 3 is considerably stronger than Theorem
2. Theorem 3 is often paraphrased by saying that a continuous function on
a closed interval "takes on its maximum value" on that interval.

As a compensation for the stringency of the hypotheses of our three theorems,
the conclusions are of a totally different order than those of previous theorems.
They describe the behavior of a function, not just near a point, but on a whole
interval; such "global" properties of a function are always significantly more
difficult to prove than "local" properties, and are correspondingly of much
greater power. To illustrate the usefulness of Theorems 1, 2, and 3, we will
soon deduce some important consequences, but it will help to first mention
some simple generalizations of these theorems.


THEOREM 4 If f is continuous on [a, b] and f(a) < c < f(b), then there is some
x in [a, b] such that f(x) = c.


PROOF Let g = f − c. Then g is continuous, and g(a) < 0 < g(b). By Theorem 1,
there is some x in [a, b] such that g(x) = 0. But this means that f(x) = c. ∎


THEOREM 5 If f is continuous on [a, b] and f(a) > c > f(b), then there is some
x in [a, b] such that f(x) = c.


PROOF The function −f is continuous on [a, b] and −f(a) < −c < −f(b). By
Theorem 4 there is some x in [a, b] such that −f(x) = −c, which means
that f(x) = c. ∎


Theorems 4 and 5 together show that f takes on any value between f(a) and
f(b). We can do even better than this: if c and d are in [a, b], then f takes on
any value between f(c) and f(d). The proof is simple; if, for example, c < d,
then just apply Theorems 4 and 5 to the interval [c, d]. Summarizing, if a
continuous function on an interval takes on two values, it takes on every value
in between; this slight generalization of Theorem 1 is often called the
Intermediate Value Theorem.


THEOREM 6 If f is continuous on [a, b], then f is bounded below on [a, b], that
is, there is some number N such that f(x) ≥ N for all x in [a, b].


PROOF The function −f is continuous on [a, b], so by Theorem 2 there is a
number M such that −f(x) ≤ M for all x in [a, b]. But this means that
f(x) ≥ −M for all x in [a, b], so we can let N = −M. ∎


Theorems 2 and 6 together show that a continuous function f on [a, b] is
bounded on [a, b], that is, there is a number N such that |f(x)| ≤ N for all x in
[a, b]. In fact, since Theorem 2 ensures the existence of a number N_1 such that
f(x) ≤ N_1 for all x in [a, b], and Theorem 6 ensures the existence of a number
N_2 such that f(x) ≥ N_2 for all x in [a, b], we can take N = max(|N_1|, |N_2|).


THEOREM 7 If f is continuous on [a, b], then there is some y in [a, b] such that
f(y) ≤ f(x) for all x in [a, b].

(A continuous function on a closed interval takes on its minimum value on
that interval.)


PROOF The function −f is continuous on [a, b]; by Theorem 3 there is some y in
[a, b] such that −f(y) ≥ −f(x) for all x in [a, b], which means that
f(y) ≤ f(x) for all x in [a, b]. ∎


Now that we have derived the trivial consequences of Theorems 1, 2, and 3, 
we can begin proving a few interesting things. 


THEOREM 8 Every positive number has a square root. In other words, if a > 0,
then there is some number x such that x² = a.


FIGURE 7


PROOF Consider the function f(x) = x², which is certainly continuous. Notice
that the statement of the theorem can be expressed in terms of f: "the number a
has a square root" means that f takes on the value a. The proof of this fact
about f will be an easy consequence of Theorem 4.

There is obviously a number b > 0 such that f(b) > a (as illustrated in
Figure 7); in fact, if a > 1 we can take b = a, while if a < 1 we can take
b = 1. Since f(0) < a < f(b), Theorem 4 applied to [0, b] implies that for
some x (in [0, b]), we have f(x) = a, i.e., x² = a. ∎
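The existence argument above can be imitated numerically. The following is only a sketch, not part of the proof: it starts from the interval [0, b] used in the proof (b = a if a > 1, otherwise b = 1) and repeatedly halves it, keeping f(lo) < a ≤ f(hi) at every step. The function name `sqrt_by_bisection` and the parameter `iterations` are illustrative choices, and floating-point numbers only approximate the real numbers the theorem is about.

```python
def sqrt_by_bisection(a, iterations=100):
    """Approximate the x >= 0 with x*x = a, for a > 0, by repeated halving."""
    if a <= 0:
        raise ValueError("a must be positive")

    def f(x):
        return x * x

    lo = 0.0                      # f(lo) = 0 < a
    hi = a if a > 1 else 1.0      # the b of the proof, so f(hi) >= a
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < a:
            lo = mid              # keep f(lo) < a
        else:
            hi = mid              # keep f(hi) >= a
    return hi


if __name__ == "__main__":
    print(sqrt_by_bisection(2.0))
```

Theorem 4 guarantees that a square root exists somewhere in [0, b]; the halving procedure merely narrows down where it lies.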


Precisely the same argument can be used to prove that a positive number has
an nth root, for any natural number n. If n happens to be odd, one can do
better: every number has an nth root. To prove this we just note that if the
positive number a has the nth root x, i.e., if x^n = a, then (−x)^n = −a (since
n is odd), so −a has the nth root −x. The assertion, that for odd n any number a
has an nth root, is equivalent to the statement that the equation

$$ x^n - a = 0 $$

has a root if n is odd. Expressed in this way the result is susceptible of great
generalization.


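THEOREM 9 If n is odd, then any equation

$$ x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0 $$

has a root.


PROOF We obviously want to consider the function

$$ f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0; $$

we would like to prove that f is sometimes positive and sometimes negative.
The intuitive idea is that for large |x|, the function is very much like g(x) = x^n
and, since n is odd, this function is positive for large positive x and negative for
large negative x. A little algebra is all we need to make this intuitive idea work.

The proper analysis of the function f depends on writing

$$ f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0 = x^n\left(1 + \frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n}\right). $$

Note that

$$ \left|\frac{a_{n-1}}{x} + \frac{a_{n-2}}{x^2} + \cdots + \frac{a_0}{x^n}\right| \le \frac{|a_{n-1}|}{|x|} + \cdots + \frac{|a_0|}{|x^n|}. $$

Consequently, if we choose x satisfying

$$ (*) \qquad |x| > 1,\ 2n|a_{n-1}|,\ \ldots,\ 2n|a_0|, $$

then |x^k| ≥ |x| and

$$ \frac{|a_{n-k}|}{|x^k|} \le \frac{|a_{n-k}|}{|x|} < \frac{|a_{n-k}|}{2n|a_{n-k}|} = \frac{1}{2n}, $$

so

$$ \left|\frac{a_{n-1}}{x} + \frac{a_{n-2}}{x^2} + \cdots + \frac{a_0}{x^n}\right| \le \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{n\ \text{terms}} = \frac{1}{2}. $$

In other words,

$$ -\frac{1}{2} \le \frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n} \le \frac{1}{2}, $$

which implies that

$$ \frac{1}{2} \le 1 + \frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n}. $$

Therefore, if we choose an x_1 > 0 which satisfies (*), then

$$ \frac{x_1^n}{2} \le x_1^n\left(1 + \frac{a_{n-1}}{x_1} + \cdots + \frac{a_0}{x_1^n}\right) = f(x_1), $$

so that f(x_1) > 0. On the other hand, if x_2 < 0 satisfies (*), then x_2^n < 0 and

$$ \frac{x_2^n}{2} \ge x_2^n\left(1 + \frac{a_{n-1}}{x_2} + \cdots + \frac{a_0}{x_2^n}\right) = f(x_2), $$

so that f(x_2) < 0.

Now applying Theorem 1 to the interval [x_2, x_1] we conclude that there is
an x in [x_2, x_1] such that f(x) = 0. ∎


FIGURE 8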
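The proof of Theorem 9 is constructive enough to be sketched in code: condition (*) gives explicit points x_2 < 0 < x_1 with f(x_2) < 0 < f(x_1), and an interval-halving search then narrows down a root between them. The names `poly`, `bound` and `root_of_odd_polynomial` below are hypothetical helpers introduced for illustration, the coefficient list is assumed to be [a_{n-1}, ..., a_0], and floating-point bisection only approximates the root whose existence Theorem 1 asserts.

```python
def poly(coeffs, x):
    """Evaluate x^n + a_{n-1} x^{n-1} + ... + a_0 for coeffs = [a_{n-1}, ..., a_0]."""
    value = 1.0                      # leading coefficient is 1
    for a in coeffs:
        value = value * x + a        # Horner's rule
    return value


def bound(coeffs):
    """Return a number x1 > max(1, 2n|a_{n-1}|, ..., 2n|a_0|), so x1 satisfies (*)."""
    n = len(coeffs)
    return max([1.0] + [2 * n * abs(a) for a in coeffs]) + 1.0


def root_of_odd_polynomial(coeffs, iterations=200):
    """Approximate a root of a monic polynomial of odd degree n = len(coeffs)."""
    if len(coeffs) % 2 == 0:
        raise ValueError("the degree must be odd")
    x1 = bound(coeffs)               # f(x1) > 0 by the proof
    lo, hi = -x1, x1                 # f(lo) < 0 < f(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if poly(coeffs, mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # x^3 - x + 3, written as [a_2, a_1, a_0]
    print(root_of_odd_polynomial([0.0, -1.0, 3.0]))
```

The bound produced by (*) is usually far larger than necessary; it is chosen for ease of proof, not for efficiency.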


Theorem 9 disposes of the problem of odd degree equations so happily that
it would be frustrating to leave the problem of even degree equations
completely undiscussed. At first sight, however, the problem seems insuperable.
Some equations, like x² − 1 = 0, have a solution, and some, like x² + 1 = 0,
do not; what more is there to say? If we are willing to consider a more general
question, however, something interesting can be said. Instead of trying to solve
the equation

$$ x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0, $$

let us ask about the possibility of solving the equations

$$ x^n + a_{n-1}x^{n-1} + \cdots + a_0 = c $$

for all possible numbers c. This amounts to allowing the constant term a_0 to
vary. The information which can be given concerning the solution of these
equations depends on a fact which is illustrated in Figure 8.

The graph of the function f(x) = x^n + a_{n−1}x^{n−1} + ··· + a_0, with n even,
contains, at least the way we have drawn it, a lowest point. In other words,
there is a number y such that f(y) ≤ f(x) for all numbers x; the function f
takes on a minimum value, not just on each closed interval, but on the whole
line. (Notice that this is false if n is odd.) The proof depends on Theorem 7, but
a tricky application will be required. We can apply Theorem 7 to any interval
[a, b], and obtain a point y_0 such that f(y_0) is the minimum value of f on [a, b];
but if [a, b] happens to be the interval shown in Figure 8, for example, then
the point y_0 will not be the place where f has its minimum value for the whole
line. In the next theorem the entire point of the proof is to choose an interval
[a, b] in such a way that this cannot happen.


THEOREM 10 If n is even and f(x) = x^n + a_{n−1}x^{n−1} + ··· + a_0, then there
is a number y such that f(y) ≤ f(x) for all x.


FIGURE 9


PROOF As in the proof of Theorem 9, if

$$ M = \max(1,\ 2n|a_{n-1}|,\ \ldots,\ 2n|a_0|), $$

then for all x with |x| ≥ M, we have

$$ \frac{1}{2} \le 1 + \frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n}. $$

Since n is even, x^n ≥ 0 for all x, so

$$ \frac{x^n}{2} \le x^n\left(1 + \frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n}\right) = f(x), $$

provided that |x| ≥ M. Now consider the number f(0). Let b > 0 be a number
such that b^n ≥ 2f(0) and also b > M. Then, if x ≥ b, we have (Figure 9)

$$ f(x) \ge \frac{x^n}{2} \ge \frac{b^n}{2} \ge f(0). $$

Similarly, if x ≤ −b, then

$$ f(x) \ge \frac{x^n}{2} \ge \frac{(-b)^n}{2} = \frac{b^n}{2} \ge f(0). $$

Summarizing:

    if x ≥ b or x ≤ −b, then f(x) ≥ f(0).

Now apply Theorem 7 to the function f on the interval [−b, b]. We conclude
that there is a number y such that

    (1) if −b ≤ x ≤ b, then f(y) ≤ f(x).

In particular, f(y) ≤ f(0). Thus

    (2) if x ≤ −b or x ≥ b, then f(x) ≥ f(0) ≥ f(y).

Combining (1) and (2) we see that f(y) ≤ f(x) for all x. ∎
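The choice of the interval [−b, b] in this proof can be made concrete. The sketch below computes a b satisfying b > M and b^n ≥ 2f(0), and then searches a grid on [−b, b] for the smallest sampled value; by the proof, nothing outside [−b, b] can fall below f(0), so nothing is lost by ignoring the rest of the line. The helper names and the grid search are illustrative assumptions (a grid only approximates the minimum that Theorem 7 guarantees), and the coefficient list is again assumed to be [a_{n-1}, ..., a_0].

```python
def poly(coeffs, x):
    """Evaluate x^n + a_{n-1} x^{n-1} + ... + a_0 for coeffs = [a_{n-1}, ..., a_0]."""
    value = 1.0
    for a in coeffs:
        value = value * x + a
    return value


def minimizing_interval(coeffs):
    """Return b > 0 with b > M and b^n >= 2 f(0), as chosen in the proof."""
    n = len(coeffs)
    if n == 0 or n % 2 == 1:
        raise ValueError("the degree must be even and positive")
    M = max([1.0] + [2 * n * abs(a) for a in coeffs])
    f0 = coeffs[-1]                  # f(0) = a_0
    b = M + 1.0
    while b ** n < 2 * f0:
        b *= 2
    return b


def approximate_minimum(coeffs, samples=20001):
    """Return (x, f(x)) for the smallest value of f found on a grid over [-b, b]."""
    b = minimizing_interval(coeffs)
    best_x, best_value = 0.0, poly(coeffs, 0.0)
    for i in range(samples):
        x = -b + 2 * b * i / (samples - 1)
        value = poly(coeffs, x)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value


if __name__ == "__main__":
    # x^4 - 3x^2 + 1, written as [a_3, a_2, a_1, a_0]
    print(approximate_minimum([0.0, -3.0, 0.0, 1.0]))
```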


Theorem 10 now allows us to prove the following result. 


THEOREM 11 Consider the equation

$$ (*) \qquad x^n + a_{n-1}x^{n-1} + \cdots + a_0 = c, $$

and suppose n is even. Then there is a number m such that (*) has a solution
for c ≥ m and has no solution for c < m.


PROOF Let f(x) = x^n + a_{n−1}x^{n−1} + ··· + a_0 (Figure 10).

According to Theorem 10 there is a number y such that f(y) ≤ f(x) for all
x. Let m = f(y). If c < m, then the equation (*) obviously has no solution,
since the left side always has a value ≥ m. If c = m, then (*) has y as a
solution. Finally, suppose c > m. Let b be a number such that b > y and f(b) > c.
Then f(y) = m < c < f(b). Consequently, by Theorem 4, there is some number
x in [y, b] such that f(x) = c, so x is a solution of (*). ∎


FIGURE 10


These consequences of Theorems 1, 2, and 3 are the only ones we will derive
now (these theorems will play a fundamental role in everything we do later,
however). Only one task remains: to prove Theorems 1, 2, and 3. Unfortunately,
we cannot hope to do this; on the basis of our present knowledge
about the real numbers (namely, P1-P12) a proof is impossible. There are
several ways of convincing ourselves that this gloomy conclusion is actually the
case. For example, the proof of Theorem 8 relies only on the proof of Theorem
1; if we could prove Theorem 1, then the proof of Theorem 8 would be complete,
and we would have a proof that every positive number has a square root.
As pointed out in Part I, it is impossible to prove this on the basis of P1-P12.
Again, suppose we consider the function

$$ f(x) = \frac{1}{x^2 - 2}. $$

If there were no number x with x² = 2, then f would be continuous, since the
denominator would never = 0. But f is not bounded on [0, 2]. So Theorem 2
depends essentially on the existence of numbers other than rational numbers,
and therefore on some property of the reals other than P1-P12.

Despite our inability to prove Theorems 1, 2, and 3, they are certainly 
results which we want to be true. If the pictures we have been drawing have 
any connection with the mathematics we are doing, if our notion of continuous 
function corresponds to any degree with our intuitive notion, Theorems 1, 2, 
and 3 have got to be true. Since a proof of any of these theorems must require 
some new property of R which has so far been overlooked, our present diffi- 
culties suggest a way to discover that property: let us try to construct a 
proof of Theorem 1, for example, and see what goes wrong. 

One idea which seems promising is to locate the first point where f(x) = 0,
that is, the smallest x in [a, b] such that f(x) = 0. To find this point, first
consider the set A which contains all numbers x in [a, b] such that f is negative on
[a, x]. In Figure 11, x is such a point, while x′ is not. The set A itself is indicated
by a heavy line. Since f is negative at a, and positive at b, the set A contains
some points greater than a, while all points sufficiently close to b are not in A.
(We are here using the continuity of f on [a, b], as well as Problem 6-15.)


FIGURE 11


FIGURE 12 (f(x) < 0 for all x in this interval; A would also contain all these points)


FIGURE 13 (f(x) > 0 for all x in this interval; A could really be only this big)


Now suppose α is the smallest number which is greater than all members of
A; clearly a < α < b. We claim that f(α) = 0, and to prove this we only have
to eliminate the possibilities f(α) < 0 and f(α) > 0.

Suppose first that f(α) < 0. Then, by Theorem 6-3, f(x) would be less than
0 for all x in a small interval containing α, in particular for some numbers
bigger than α (Figure 12); but this contradicts the fact that α is bigger than
every member of A, since the larger numbers would also be in A. Consequently,
f(α) < 0 is false.

On the other hand, suppose f(α) > 0. Again applying Theorem 6-3, we
see that f(x) would be positive for all x in a small interval containing α, in
particular for some numbers smaller than α (Figure 13). This means that these
smaller numbers are all not in A. Consequently, one could have chosen an
even smaller α which would be greater than all members of A. Once again we
have a contradiction; f(α) > 0 is also false. Hence f(α) = 0 and, we are
tempted to say, Q.E.D.

We know, however, that something must be wrong, since no new properties
of R were ever used, and it does not require much scrutiny to find the dubious
point. It is clear that we can choose a number α which is greater than all
members of A (for example, we can choose α = b), but it is not so clear that
we can choose a smallest one. In fact, suppose A consists of all numbers x ≥ 0
such that x² < 2. If the number √2 did not exist, there would not be a least
number greater than all the members of A; for any y > √2 we chose, we could
always choose a still smaller one.
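The last remark can be checked with exact rational arithmetic. The sketch below is an illustration, not the book's argument: it uses the standard averaging step y ↦ (y + 2/y)/2, which is an assumption introduced here. If y is a positive rational with y² > 2, then 2/y < y, so the average is smaller than y, and its square minus 2 equals ((y − 2/y)/2)² > 0, so it is still greater than every x ≥ 0 with x² < 2.

```python
from fractions import Fraction


def is_upper_bound(y):
    """True if the rational y > 0 exceeds every x >= 0 with x*x < 2 (i.e. y*y > 2)."""
    return y > 0 and y * y > 2


def smaller_upper_bound(y):
    """Given a rational upper bound y, return a strictly smaller rational upper bound."""
    if not is_upper_bound(y):
        raise ValueError("y must be a positive rational with y*y > 2")
    return (y + 2 / y) / 2


if __name__ == "__main__":
    y = Fraction(3, 2)
    for _ in range(4):
        y = smaller_upper_bound(y)
        print(y)
```

Because `Fraction` arithmetic is exact, every number produced is a genuine rational, and the process never stops at a least one.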

Now that we have discovered the fallacy, it is almost obvious what addi- 
tional property of the real numbers we need. All we must do is say it properly 
and use it. That is the business of the next chapter. 


PROBLEMS 


1. For each of the following functions, decide which are bounded above or
below on the indicated interval, and which take on their maximum or
minimum value. (Notice that f might have these properties even if f is not
continuous, and even if the interval is not a closed interval.)

(i) f(x) = x² on (−1, 1).
(ii) f(x) = x³ on (−1, 1).
(iii) f(x) = x² on R.
(iv) f(x) = x² on [0, ∞).
(v) f(x) = x² for x ≤ a, and f(x) = a + 2 for x > a, on (−a − 1, a + 1).
(It will be necessary to consider several possibilities for a.)
(vi) f(x) = x² for x < a, and f(x) = a + 2 for x ≥ a, on [−a − 1, a + 1].
(vii) f(x) = 0 for x irrational, and f(x) = 1/q for x = p/q in lowest terms,
on [0, 1].
(viii) f(x) = 1 for x irrational, and f(x) = 1/q for x = p/q in lowest terms,
on [0, 1].
(ix) f(x) = 1 for x irrational, and f(x) = −1/q for x = p/q in lowest terms,
on [0, 1].
(x) f(x) = x for x rational, and f(x) = 0 for x irrational, on [0, a].
(xi) f(x) = sin²(cos x + √(1 + a²)) on [0, a³].
(xii) f(x) = [x] on [0, a].

2. For each of the following polynomial functions f, find an integer n such
that f(x) = 0 for some x between n and n + 1.

(i) f(x) = x³ − x + 3.
(ii) f(x) = x⁵ + 5x⁴ + 2x + 1.
(iii) f(x) = x⁵ + x + 1.
(iv) f(x) = 4x² − 4x + 1.

3. Prove that there is some number x such that

(i) x^179 + 163/(1 + x² + sin² x) = 119.
(ii) sin x = x − 1.


4. This problem is a continuation of Problem 3-7.

(a) If n − k is even, and ≥ 0, find a polynomial function of degree n
with exactly k roots.

(b) A root a of the polynomial function f is said to have multiplicity m
if f(x) = (x − a)^m g(x), where g is a polynomial function that does
not have a as a root. Let f be a polynomial function of degree n.
Suppose that f has k roots, counting multiplicities, i.e., suppose that
k is the sum of the multiplicities of all the roots. Show that n − k is
even.


5. Suppose that f is continuous on [a, b] and that f(x) is always rational.
What can be said about f?

6. Suppose that f is a continuous function on [−1, 1] such that x² + (f(x))² = 1
for all x. (This means that (x, f(x)) always lies on the unit circle.) Show
that either f(x) = √(1 − x²) for all x, or else f(x) = −√(1 − x²) for all x.

7. How many continuous functions f are there which satisfy (f(x))² = x²
for all x?

8. Suppose that f and g are continuous, that f² = g², and that f(x) ≠ 0 for
all x. Prove that either f(x) = g(x) for all x, or else f(x) = −g(x) for all x.

9. (a) Suppose that f is continuous, that f(x) = 0 only for x = a, and that
f(x) > 0 for some x > a as well as for some x < a. What can be said
about f(x) for all x ≠ a?

*(b) Using only the fact that x² + xy + y² ≠ 0 if x and y are not both 0,
show that x² + xy + y² > 0 if x and y are not both 0. (The trick is
to consider various fixed y ≠ 0, so that you can define a function.)


*(c) Discuss similarly the sign of x³ + x²y + xy² + y³ when x and y are
not both 0.

FIGURE 14

FIGURE 15

10. Suppose f and g are continuous on [a, b] and that f(a) < g(a), but
f(b) > g(b). Prove that f(x) = g(x) for some x in [a, b]. (If your proof
isn't very short, it's not the right one.)

11. Suppose that f is a continuous function on [0, 1] and that f(x) is in [0, 1]
for each x (draw a picture). Prove that f(x) = x for some number x.

12. (a) Problem 11 shows that f intersects the diagonal of the square in
Figure 14 (solid line). Show that f must also intersect the other
(dashed) diagonal.

(b) Prove the following more general fact: If g is continuous on [0, 1]
and g(0) = 0, g(1) = 1 or g(0) = 1, g(1) = 0, then f(x) = g(x) for
some x.

13. (a) Let f(x) = sin 1/x for x ≠ 0 and let f(0) = 0. Is f continuous on
[−1, 1]? Show that f satisfies the conclusion of the Intermediate
Value Theorem on [−1, 1]; in other words, if f takes on two values
somewhere on [−1, 1], it also takes on every value in between.

*(b) Suppose that f satisfies the conclusion of the Intermediate Value
Theorem, and that f takes on each value only once. Prove that f is
continuous.

*(c) Generalize to the case where f takes on each value only finitely many
times.

14. If f is a continuous function on [0, 1], let ||f|| be the maximum value
of |f| on [0, 1].

(a) Prove that for any number c we have ||cf|| = |c| · ||f||.

*(b) Prove that ||f + g|| ≤ ||f|| + ||g||. Give an example where
||f + g|| ≠ ||f|| + ||g||.

(c) Prove that ||h − f|| ≤ ||h − g|| + ||g − f||.


*15. Suppose that φ is continuous and lim_{x→∞} φ(x)/x^n = 0 = lim_{x→−∞} φ(x)/x^n.

(a) Prove that if n is odd, then there is a number x such that x^n + φ(x) = 0.
(b) Prove that if n is even, then there is a number y such that
y^n + φ(y) ≤ x^n + φ(x) for all x.

Hint: Of which proofs does this problem test your understanding?

*16. Let f be any polynomial function. Prove that there is some number y
such that |f(y)| ≤ |f(x)| for all x.

17. Suppose that f is a continuous function with f(x) > 0 for all x, and
lim_{x→∞} f(x) = 0 = lim_{x→−∞} f(x). (Draw a picture.) Prove that there is some
number y such that f(y) ≥ f(x) for all x.

*18. (a) Suppose that f is continuous on [a, b], and let x be any number.
Prove that there is a point on the graph of f which is closest to (x, 0);
in other words there is some y in [a, b] such that the distance from
(x, 0) to (y, f(y)) is ≤ distance from (x, 0) to (z, f(z)) for all z in
[a, b]. (See Figure 15.)

(b) Show that this same assertion is not necessarily true if [a, b] is
replaced by (a, b) throughout.
(c) Show that the assertion is true if [a, b] is replaced by R throughout.
(d) In cases (a) and (c), let g(x) be the minimum distance from (x, 0) to
a point on the graph of f. Prove that g(y) ≤ g(x) + |x − y|, and
conclude that g is continuous.
(e) Prove that there are numbers x_0 and x_1 in [a, b] such that the distance
from (x_0, 0) to (x_1, f(x_1)) is ≤ the distance from (x_0′, 0) to (x_1′, f(x_1′))
for any x_0′, x_1′ in [a, b].

19. (a) Suppose that f is continuous on [0, 1] and f(0) = f(1). Let n be any
natural number. Prove that there is some number x such that
f(x) = f(x + 1/n), as shown in Figure 16 for n = 4. Hint: Consider
the function g(x) = f(x) − f(x + 1/n); what would be true if
g(x) ≠ 0 for all x?
(b) Suppose 0 < a < 1, but that a is not equal to 1/n for any natural
number n. Find a function f which is continuous on [0, 1] and which
satisfies f(0) = f(1), but which does not satisfy f(x) = f(x + a) for
any x.
**20. (a) Prove that there does not exist a continuous function f defined on
R which takes on every value exactly twice. Hint: If f(a) = f(b) for
a < b, then either f(x) > f(a) for all x in (a, b) or f(x) < f(a) for
all x in (a, b). Why? In the first case all values close to f(a), but
slightly larger than f(a), are taken on somewhere in (a, b); this
implies that f(x) < f(a) for x < a and x > b.

(b) Refine part (a) by proving that there is no continuous function f
which takes on each value either 0 times or 2 times, i.e., which
takes on exactly twice each value that it does take on. Hint: The
previous hint implies that f has either a maximum or a minimum
value (which must be taken on twice). What can be said about
values close to the maximum value?

(c) Find a continuous function f which takes on every value exactly
3 times. More generally, find one which takes on every value exactly
n times, if n is odd.

(d) Prove that if n is even, then there is no continuous f which takes on
every value exactly n times. Hint: To treat the case n = 4, for
example, let f(x_1) = f(x_2) = f(x_3) = f(x_4). Then either f(x) > 0 for
all x in two of the three intervals (x_1, x_2), (x_2, x_3), (x_3, x_4), or else
f(x) < 0 for all x in two of these three intervals.


FIGURE 16


CHAPTER 8


LEAST UPPER BOUNDS


This chapter reveals the most important property of the real numbers.
Nevertheless, it is merely a sequel to Chapter 7; the path which must be followed
has already been indicated, and further discussion would be useless delay.


DEFINITION A set A of real numbers is bounded above if there is a number x
such that

    x ≥ a for every a in A.

Such a number x is called an upper bound for A.


Obviously A is bounded above if and only if there is a number x which is an
upper bound for A (and in this case there will be lots of upper bounds for A);
we often say, as a concession to idiomatic English, that "A has an upper
bound" when we mean that there is a number which is an upper bound for A.

Notice that the term "bounded above" has now been used in two ways:
first, in Chapter 7, in reference to functions, and now in reference to sets. This
dual usage should cause no confusion, since it will always be clear whether
we are talking about a set of numbers or a function. Moreover, the two
definitions are closely connected: if A is the set {f(x): a ≤ x ≤ b}, then the function
f is bounded above on [a, b] if and only if the set A is bounded above.

The entire collection R of real numbers, and the natural numbers N, are
both examples of sets which are not bounded above. An example of a set which
is bounded above is

    A = {x: 0 ≤ x < 1}.

To show that A is bounded above we need only name some upper bound for
A, which is easy enough; for example, 138 is an upper bound for A, and so are
2, 1½, 1¼, and 1. Clearly, 1 is the least upper bound of A; although the phrase
just introduced is self-explanatory, in order to avoid any possible confusion
(in particular, to ensure that we all know what the superlative of "less"
means), we define this explicitly.


DEFINITION A number x is a least upper bound of A if

    (1) x is an upper bound of A,
and (2) if y is an upper bound of A, then x ≤ y.


The use of the indefinite article "a" in this definition was merely a
concession to temporary ignorance. Now that we have made a precise definition, it is
easily seen that if x and y are both least upper bounds of A, then x = y. Indeed,
in this case

    x ≤ y, since y is an upper bound, and x is a least upper bound,
and y ≤ x, since x is an upper bound, and y is a least upper bound;

it follows that x = y. For this reason we speak of the least upper bound of A.
The term supremum of A is synonymous and has one advantage. It
abbreviates quite nicely to

    sup A (pronounced "soup A")

and saves us from the abbreviation

    lub A

(which is nevertheless used by some authors).

There is a series of important definitions, analogous to those just given,
which can now be treated more briefly. A set A of real numbers is bounded
below if there is a number x such that

    x ≤ a for every a in A.

Such a number x is called a lower bound for A. A number x is the greatest
lower bound of A if

    (1) x is a lower bound of A,
and (2) if y is a lower bound of A, then x ≥ y.

The greatest lower bound of A is also called the infimum of A, abbreviated

    inf A;

some authors use the abbreviation

    glb A.


One detail has been omitted from our discussion so far—the question of which 
sets have at least one, and hence exactly one, least upper bound or greatest 
lower bound. We will consider only least upper bounds, since the question for 
greatest lower bounds can then be answered easily (Problem 2). 

If A is not bounded above, then A has no upper bound at all, so A certainly
cannot be expected to have a least upper bound. It is tempting to say that A
does have a least upper bound if it has some upper bound, but, like the principle
of mathematical induction, this assertion can fail to be true in a rather special
way. If A = ∅, then A is bounded above. Indeed, any number x is an upper
bound for ∅:

    x ≥ y for every y in ∅,

simply because there is no y in ∅. Since every number is an upper bound for ∅,
there is surely no least upper bound for ∅. With this trivial exception however,
our assertion is true, and very important, definitely important enough to
warrant consideration of details. We are finally ready to state the last property
of the real numbers which we need.


FIGURE 1

FIGURE 2 (f(x) < 0 for all x in this interval)

FIGURE 3 (f(x) > 0 for all x in this interval)


(P13) (The least upper bound property) If A is a set of real numbers,
A ≠ ∅, and A is bounded above, then A has a least upper bound.


Property P13 may strike you as anticlimactic, but that is actually one of its
virtues. To complete our list of basic properties for the real numbers we require
no particularly abstruse proposition, but only a property so simple that we
might feel foolish for having overlooked it. Of course, the least upper bound
property is not really so innocent as all that; after all, it does not hold for the
rational numbers Q. For example, if A is the set of all rational numbers x
satisfying x² < 2, then there is no rational number y which is an upper bound
for A and which is less than or equal to every other rational number which is
an upper bound for A. It will become clear only gradually how significant
P13 is, but we are already in a position to demonstrate its power, by supplying
the proofs which were omitted in Chapter 7.


THEOREM 7-1 If f is continuous on [a, b] and f(a) < 0 < f(b), then there is
some number x in [a, b] such that f(x) = 0.


PROOF Our proof is merely a rigorous version of the outline developed at the
end of Chapter 7; we will locate the smallest number x in [a, b] with f(x) = 0.
Define the set A, shown in Figure 1, as follows:

    A = {x: a ≤ x ≤ b, and f is negative on the interval [a, x]}.

Clearly A ≠ ∅, since a is in A; in fact, there is some δ > 0 such that A contains
all points x satisfying a ≤ x < a + δ; this follows from Problem 6-15, since f
is continuous on [a, b] and f(a) < 0. Similarly, b is an upper bound for A and,
in fact, there is a δ > 0 such that all points x satisfying b − δ < x ≤ b are
upper bounds for A; this also follows from Problem 6-15, since f(b) > 0.

From these remarks it follows that A has a least upper bound α and that
a < α < b. We now wish to show that f(α) = 0, by eliminating the
possibilities f(α) < 0 and f(α) > 0.

Suppose first that f(α) < 0. By Theorem 6-3, there is a δ > 0 such that
f(x) < 0 for α − δ < x < α + δ (Figure 2). Now there is some number x_0 in
A which satisfies α − δ < x_0 < α (because otherwise α would not be the least
upper bound of A). This means that f is negative on the whole interval
[a, x_0]. But if x_1 is a number between α and α + δ, then f is also negative on
the whole interval [x_0, x_1]. Therefore f is negative on the interval [a, x_1], so
x_1 is in A. But this contradicts the fact that α is an upper bound for A; our
original assumption that f(α) < 0 must be false.

Suppose, on the other hand, that f(α) > 0. Then there is a number δ > 0
such that f(x) > 0 for α − δ < x < α + δ (Figure 3). Once again we know
that there is an x_0 in A satisfying α − δ < x_0 < α; but this means that f is
negative on [a, x_0], which is impossible, since f(x_0) > 0. Thus the assumption
f(α) > 0 also leads to a contradiction, leaving f(α) = 0 as the only possible
alternative. ∎
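The number α in this proof is the least upper bound of the set of points x such that f is negative on all of [a, x], and the proof shows that α is the first zero of f. A rough numerical picture of this can be sketched by scanning a grid from a and stopping at the first grid point where f is no longer negative. This is only an illustration under stated assumptions: the function name `approx_sup_A` and the parameter `steps` are hypothetical, and a grid can miss behavior between sample points, which is exactly why the real proof needs P13 rather than a search.

```python
def approx_sup_A(f, a, b, steps=100000):
    """Approximate sup{x in [a, b]: f is negative on [a, x]} by scanning a grid."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    h = (b - a) / steps
    alpha = a                         # a itself belongs to A
    for i in range(1, steps + 1):
        x = a + i * h
        if f(x) >= 0:                 # f stops being negative: leave A
            break
        alpha = x                     # f was negative at every grid point up to x
    return alpha


if __name__ == "__main__":
    def g(x):
        return (x - 1) * (x - 2) * (x - 3)

    # g has zeros at 1, 2 and 3; the scan stops near the first of them.
    print(approx_sup_A(g, 0.0, 4.0))
```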


The proofs of Theorems 2 and 3 of Chapter 7 require a simple preliminary
result, which will play much the same role as Theorem 6-3 played in the
previous proof.


THEOREM 1 If f is continuous at a, then there is a number δ > 0 such that f is
bounded above on the interval (a − δ, a + δ) (see Figure 4).


PROOF Since lim_{x→a} f(x) = f(a), there is, for every ε > 0, a δ > 0 such that,
for all x,

    if |x − a| < δ, then |f(x) − f(a)| < ε.

It is only necessary to apply this statement to some particular ε (any one will
do), for example, ε = 1. We conclude that there is a δ > 0 such that, for all x,

    if |x − a| < δ, then |f(x) − f(a)| < 1.

It follows, in particular, that if |x − a| < δ, then f(x) − f(a) < 1. This
completes the proof: on the interval (a − δ, a + δ) the function f is bounded
above by f(a) + 1. ∎


It should hardly be necessary to add that we can now also prove that f is
bounded below on some interval (a − δ, a + δ), and, finally, that f is bounded
on some open interval containing a.

A more significant point is the observation that if lim_{x→a+} f(x) = f(a), then
there is a δ > 0 such that f is bounded on the set {x: a ≤ x < a + δ}, and a
similar observation holds if lim_{x→b−} f(x) = f(b). Having made these observations
(and assuming that you will supply the proofs), we tackle our second major
theorem.


FIGURE 4

FIGURE 5 ({f(y): a ≤ y ≤ x})


If f is continuous on [a, 6], then f is bounded above on [a, ὁ]. 


Let 
A = {x: a <x <6 and f is bounded above on ἴα, x]}. 


Clearly A ¥ @ (since ais in A), and A is bounded above (by ὁ), so A has a least 
upper bound α. Notice that we are here applying the term ‘“‘bounded above”’ 
both to the set A, which can be visualized as lying on the horizontal axis, and 
to f, 1.ε., to the sets { f(y): α < y < x}, which can be visualized as lying on the 
vertical axjs (Figure 5). 

Our first step is to prove that we actually have a = ὁ. Suppose, instead, that 
a < ὁ. By Theorem 1 there is 6 > Osuch that fis bounded on (a — 6, α + δ). 
Since ἃ is the least upper bound of A there is some x9 in A satisfying a — ὃ < 
xo < α. This means that f is bounded on ἴα, xo]. But if x; is any number with 
α «χι <a-+ ὃ, then fis also bounded on [xo, x]. Therefore f is bounded on 
[a, x1], so x; is in A, contradicting the fact that ἃ is an upper bound for A. This 
contradiction shows that a = ὁ. One detail should be mentioned: this demon- 
stration implicitly assumed that a < α (so that f would be defined on some 
interval (α — ὃ, a + 6)); the possibility @ = ἃ can be ruled out similarly, 
using the existence of a 6 > 0 such that f is bounded on {x:a < χ < a+ 6}. 

The proof is not quite complete—we only know that / is bounded on [a, x] 
for every x < ὁ, not necessarily that f is bounded on ἴα, δ]. However, only one 
small argument needs to be added. 

There is a 6 > 0 such that f is bounded on {x: ὁ — ὃ < x « b}. There is 
x9 in A such that ὁ — ὃ < x9 < ὁ. Thus f is bounded on [a, xo] and also on 
[xo, ὁ], so f is bounded on [a, 6]. J 


To prove the third important theorem we resort to a trick. 


If f is continuous on [a, 6], then there is a number y in [a, 4] such that f(y) 2 
f(x) for all x in [a, 6]. 


We already know that f is bounded on [a, 6], which means that the set 


i f(x): x in [a, 5)} 
is bounded. This set is obviously not 9, so it has a least upper bound a. Since 
a > f(x) for x in {a, 6] it suffices to show that a = f(y) for some y in [a, 6]. 
Suppose instead that a + f(y) for all yin [a, 6]. Then the function g defined 
by 


g(x) = x in [a, δ] 


- ὅς 
a -- f(x) 


is continuous on [a, b], since the denominator of the right side is never 0. On
the other hand, α is the least upper bound of {f(x): x in [a, b]}; this means
that

    for every ε > 0 there is x in [a, b] with α − f(x) < ε.

This, in turn, means that

    for every ε > 0 there is x in [a, b] with g(x) > 1/ε.

But this means that g is not bounded on [a, b], contradicting the previous
theorem. ∎


At the beginning of this chapter the set of natural numbers N was given as
an example of an unbounded set. We are now going to prove that N is
unbounded. After the difficult theorems proved in this chapter you may be
startled to find such an “obvious” theorem winding up our proceedings. If so,
you are, perhaps, allowing the geometrical picture of R to influence you too
strongly. “Look,” you may say, “the real numbers look like

    0    1    2    3    ...    n    x    n + 1

so every number x is between two integers n, n + 1 (unless x is itself an
integer).” Basing the argument on a geometric picture is not a proof, however,
and even the geometric picture contains an assumption: that if you place
unit segments end-to-end you will eventually get a segment larger than any
given segment. This axiom, often omitted from a first introduction to
geometry, is usually attributed (not quite justly) to Archimedes, and the
corresponding property for numbers, that N is not bounded, is called the
Archimedean property of the reals. This property is not a consequence of P1-P12
(see reference [17] of the Suggested Reading), although it does hold for Q,
of course. Once we have P13 however, there are no longer any problems.


THEOREM 2

N is not bounded above.


PROOF

Suppose N were bounded above. Since N ≠ ∅, there would be a least upper
bound α for N. Then

    α ≥ n for all n in N.

Consequently,

    α ≥ n + 1 for all n in N,

since n + 1 is in N if n is in N. But this means that

    α − 1 ≥ n for all n in N,

and this means that α − 1 is also an upper bound for N, contradicting the fact
that α is the least upper bound. ∎


There is a consequence of Theorem 2 (actually an equivalent formulation)
which we have very often assumed implicitly.


THEOREM 3

For any ε > 0 there is a natural number n with 1/n < ε.


PROOF

Suppose not; then 1/n ≥ ε for all n in N. Thus n ≤ 1/ε for all n in N. But this
means that 1/ε is an upper bound for N, contradicting Theorem 2. ∎
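
As a small numerical illustration of Theorem 3 (not a substitute for the proof), the following Python sketch produces, for a given ε > 0, a natural number n with 1/n < ε. The function name `witness` and the use of exact fractions are choices made for this example only; note also that computing a floor already presupposes that N is unbounded, so the code illustrates the theorem rather than establishing it.

```python
from fractions import Fraction
from math import floor

def witness(eps: Fraction) -> int:
    """Return a natural number n with 1/n < eps (eps must be positive)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # Any natural number larger than 1/eps works;
    # floor(1/eps) + 1 is the least such number.
    return floor(1 / eps) + 1

for eps in (Fraction(1, 1000), Fraction(3, 7), Fraction(5)):
    n = witness(eps)
    assert Fraction(1, n) < eps
    print(eps, n)
```

Exact fractions are used so that the comparison 1/n < ε is not blurred by rounding.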


A brief glance through Chapter 6 will show you that the result of Theorem 3 
was used in the discussion of many examples. Of course, Theorem 3 was not 
available at the time, but the examples were so important that in order to 
give them some cheating was tolerated. As partial justification for this dis- 
honesty we can claim that this result was never used in the proof of a theorem, 
but if your faith has been shaken, a review of all the proofs given so far is in 
order. Fortunately, such deception will not be necessary again. We have now 
stated every property of the real numbers that we will ever need. Henceforth, 
no more lies. 


PROBLEMS 
1. Find the least upper bound and the greatest lower bound (if they exist)
   of the following sets. Also decide which sets have greatest and least
   elements (i.e., decide when the least upper bound and greatest lower bound
   happen to belong to the set).

   (i) {1/n: n in N}.
   (ii) {1/n: n in Z and n ≠ 0}.
   (iii) {x: x = 0 or x = 1/n for some n in N}.
   (iv) {x: 0 ≤ x ≤ √2 and x is rational}.
   (v) {x: x² + x + 1 ≥ 0}.
   (vi) {x: x² + x − 1 < 0}.
   (vii) {x: x < 0 and x² + x − 1 < 0}.
   (viii) {1/n + (−1)ⁿ: n in N}.

2. (a) Suppose A ≠ ∅ is bounded below. Let −A denote the set of all −x
       for x in A. Prove that −A ≠ ∅, that −A is bounded above, and that
       −sup(−A) is the greatest lower bound of A.
   (b) If A ≠ ∅ is bounded below, let B be the set of all lower bounds of A.
       Show that B ≠ ∅, that B is bounded above, and that sup B is the
       greatest lower bound of A.


3. Let f be a continuous function on [a, b] with f(a) < 0 < f(b).

   (a) The proof of Theorem 1 showed that there is a smallest x in [a, b]
       with f(x) = 0. Is there necessarily a second smallest x in [a, b]
       with f(x) = 0? Show that there is a largest x in [a, b] with f(x) = 0.
       (Try to give an easy proof by considering a new function closely
       related to f.)
   (b) The proof of Theorem 1 depended upon consideration of A =
       {x: a ≤ x ≤ b and f is negative on [a, x]}. Give another proof of
       Theorem 1, which depends upon consideration of B = {x: a ≤ x ≤ b
       and f(x) < 0}. Which point x in [a, b] with f(x) = 0 will this proof
       locate? Give an example where the sets A and B are not the same.

4. Suppose that f is continuous on [a, b] and that f(a) = f(b) = 0. Suppose
   also that f(x₀) > 0 for some x₀ in [a, b]. Prove that there are numbers c
   and d with a ≤ c < x₀ < d ≤ b such that f(c) = f(d) = 0, but f(x) > 0
   for all x in (c, d). Hint: The previous problem can be used to good
   advantage.

5. (a) Suppose that y − x > 1. Prove that there is an integer k such that
       x < k < y. Hint: Let l be the largest integer satisfying l ≤ x, and
       consider l + 1.
   (b) Suppose x < y. Prove that there is a rational number r such that
       x < r < y. Hint: If 1/n < y − x, then ny − nx > 1. (Query: Why
       have parts (a) and (b) been postponed until this problem set?)
   (c) Suppose that r < s are rational numbers. Prove that there is an
       irrational number between r and s. Hint: As a start, you know that
       there is an irrational number between 0 and 1.
   (d) Suppose that x < y. Prove that there is an irrational number between
       x and y. Hint: It is unnecessary to do any more work; this follows
       from (b) and (c).

*6. A set A of real numbers is said to be dense if every open interval contains
    a point of A. For example, Problem 5 shows that the set of rational
    numbers and the set of irrational numbers are each dense.

   (a) Prove that if f is continuous and f(x) = 0 for all numbers x in a
       dense set A, then f(x) = 0 for all x.
   (b) Prove that if f and g are continuous and f(x) = g(x) for all x in a
       dense set A, then f(x) = g(x) for all x.
   (c) If we assume instead that f(x) ≥ g(x) for all x in A, show that
       f(x) ≥ g(x) for all x. Can ≥ be replaced by > throughout?

7. Prove that if f is continuous and f(x + y) = f(x) + f(y) for all x and y,
   then there is a number c such that f(x) = cx for all x. (This conclusion
   can be demonstrated simply by combining the results of two previous
   problems.) Point of information: There do exist noncontinuous functions f
   satisfying f(x + y) = f(x) + f(y) for all x and y, but we cannot prove this
   now; in fact, this simple question involves ideas that are usually never
   mentioned in any undergraduate course. The Suggested Reading
   contains references.
*8. Suppose that f is a function such that f(a) ≤ f(b) whenever a < b
    (Figure 6).

   (a) Prove that lim_{x→a−} f(x) and lim_{x→a+} f(x) both exist. Hint: Why is
       this problem in this chapter?
   (b) Prove that f never has a removable discontinuity (this terminology
       comes from Problem 6-16).
   (c) Prove that if f satisfies the conclusions of the Intermediate Value
       Theorem, then f is continuous.

   FIGURE 6

*9. If f is a bounded function on [0, 1], let |||f||| = sup{|f(x)|: x in [0, 1]}.
    Prove analogues of the properties of || || in Problem 7-14.
10. Suppose a > 0. Prove that every number x can be written uniquely in
    the form x = ka + x′, where k is an integer, and 0 ≤ x′ < a.
11. (a) Suppose that a₁, a₂, a₃, . . . is a sequence of positive numbers with
        a_{n+1} ≤ a_n/2. Prove that for any ε > 0 there is some n with a_n < ε.
    (b) Suppose P is a regular polygon inscribed inside a circle. If P′ is the
        inscribed regular polygon with twice as many sides, show that the
        difference between the area of the circle and the area of P′ is less
        than half the difference between the area of the circle and the area
        of P (use Figure 7).
    (c) Prove that there is a regular polygon P inscribed in a circle with
        area as close as desired to the area of the circle. In order to do part
        (c) you will need part (a). This was clear to the Greeks, who used
        part (a) as the basis for their entire treatment of proportion and
        area. By calculating the areas of polygons, this method (“the method
        of exhaustion”) allows computations of π to any desired accuracy;
        Archimedes used it to show that 223/71 < π < 22/7. But it has far
        greater theoretical importance:
   *(d) Using the fact that the areas of two regular polygons with the same
        number of sides have the same ratio as the square of their sides,
        prove that the areas of two circles have the same ratios as the square
        of their radii. Hint: Deduce a contradiction from the assumption
        that the ratio of the areas is greater, or less, than the ratio of the
        square of the radii by inscribing appropriate polygons.

    FIGURE 7

12. Suppose that A and B are two nonempty sets of numbers such that x ≤ y
    for all x in A and all y in B.

    (a) Prove that sup A ≤ y for all y in B.
    (b) Prove that sup A ≤ inf B.

13. Let A and B be two nonempty sets of numbers which are bounded above,
    and let A + B denote the set of all numbers x + y with x in A and y in B.
    Prove that sup(A + B) = sup A + sup B. Hint: The inequality
    sup(A + B) ≤ sup A + sup B is easy. Why? To prove that sup A +
    sup B ≤ sup(A + B) it suffices to prove that sup A + sup B ≤
    sup(A + B) + ε for all ε > 0; begin by choosing x in A and y in B with
    sup A − x < ε/2 and sup B − y < ε/2.

14. (a) Consider a sequence of closed intervals I₁ = [a₁, b₁], I₂ = [a₂, b₂], ....
        Suppose that a_n ≤ a_{n+1} and b_{n+1} ≤ b_n for all n (Figure 8). Prove
        that there is a point x which is in every I_n.
    (b) Show that this conclusion is false if we consider open intervals
        instead of closed intervals.

    FIGURE 8


The simple result of Problem 14(a) is called the “Nested Interval Theorem.”
It may be used to give alternative proofs of Theorems 1 and 2. The appropriate
reasoning, outlined in the next two problems, illustrates a general method,
called a “bisection argument.”

*15. Suppose f is continuous on [a, b] and f(a) < 0 < f(b). Then either
     f((a + b)/2) = 0, or f has different signs at the end points of
     [a, (a + b)/2], or f has different signs at the end points of
     [(a + b)/2, b]. Why? If f((a + b)/2) ≠ 0, let I₁ be one of the two
     intervals on which f changes sign. Now bisect I₁. Either f is 0 at the
     midpoint, or f changes sign on one of the two intervals. Let I₂ be such
     an interval. Continue in this way, to define I_n for each n (unless f is 0
     at some midpoint). Use the Nested Interval Theorem to find a point x
     where f(x) = 0.

*16. Suppose f were continuous on [a, b], but not bounded on [a, b]. Then
     f would be unbounded on either [a, (a + b)/2] or [(a + b)/2, b]. Why?
     Let I₁ be one of these intervals on which f is unbounded. Proceed as in
     Problem 15 to obtain a contradiction.

17. (a) Let A = {x: x < α}. Prove the following (they are all easy):

        (i) If x is in A and y < x, then y is in A.
        (ii) A ≠ ∅.
        (iii) A ≠ R.
        (iv) If x is in A, then there is some number x′ in A such that x < x′.

    (b) Suppose, conversely, that A satisfies (i)-(iv). Prove that A =
        {x: x < sup A}.

*18. A number x is called an almost upper bound for A if there are only
     finitely many numbers y in A with y ≥ x. An almost lower bound is
     defined similarly.

    (a) Find all almost upper bounds and almost lower bounds of the sets
        in Problem 1.
    (b) Suppose that A is a bounded infinite set. Prove that the set B of all
        almost upper bounds of A is nonempty, and bounded below.


    (c) It follows from part (b) that inf B exists; this number is called the
        limit superior of A, and denoted by \overline{\lim} A or lim sup A.
        Find \overline{\lim} A for each set A in Problem 1.
    (d) Define \underline{\lim} A, and find it for all A in Problem 1.

19. If A is a bounded infinite set prove

    (a) \underline{\lim} A ≤ \overline{\lim} A.
    (b) \overline{\lim} A ≤ sup A.
    (c) If \overline{\lim} A < sup A, then A contains a largest element.
    (d) The analogues of parts (b) and (c) for \underline{\lim}.

*20. Let f be a continuous function on R. A point x is called a shadow point
     of f if there is a number y > x with f(y) > f(x). The rationale for this
     terminology is indicated in Figure 9; the parallel lines are the rays of
     the sun rising in the east (you are facing north). Suppose that all points
     of (a, b) are shadow points, but that a and b are not shadow points.

    (a) For x in (a, b), prove that f(x) ≤ f(b). Hint: Let A = {y: x ≤ y ≤ b
        and f(x) ≤ f(y)}. If sup A were less than b, then sup A would be a
        shadow point. Use this fact to obtain a contradiction to the fact that
        b is not a shadow point.
    (b) Now prove that f(a) ≤ f(b). (This is a simple consequence of
        continuity.)
    (c) Finally, using the fact that a is not a shadow point, prove that
        f(a) = f(b).

     This result is known as the Rising Sun Lemma. Aside from serving
     as a good illustration of the use of least upper bounds, it is
     instrumental in proving several beautiful theorems that do not appear in
     this book; see page 371.

     FIGURE 9: shadow points
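
The bisection argument outlined in Problems 14 and 15 can be imitated numerically. The following Python sketch is only an illustration under stated assumptions: it works with floating-point numbers rather than real numbers, the function name `bisect_zero` and the parameter `steps` are invented for this example, and the sample function x² − 2 is an arbitrary choice. It does not replace the proof requested in Problem 15.

```python
def bisect_zero(f, a, b, steps=60):
    """Halve [a, b] repeatedly, always keeping a half on which f changes sign.

    Assumes f is continuous on [a, b] with f(a) < 0 < f(b).
    Returns the last interval (lo, hi) produced, which satisfies
    f(lo) < 0 < f(hi) unless f was exactly 0 at some midpoint.
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    lo, hi = a, b
    for _ in range(steps):
        mid = (lo + hi) / 2
        value = f(mid)
        if value == 0:
            return mid, mid   # f is 0 at a midpoint, so we are done
        if value < 0:
            lo = mid          # f changes sign on [mid, hi]
        else:
            hi = mid          # f changes sign on [lo, mid]
    return lo, hi

# Hypothetical example: f(x) = x^2 - 2 on [0, 2], whose zero is the square root of 2.
lo, hi = bisect_zero(lambda x: x * x - 2, 0.0, 2.0)
print(lo, hi)
```

Each pass through the loop corresponds to choosing the next interval in the nested sequence; the Nested Interval Theorem is what guarantees, for real numbers, that a common point exists.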


PART III


DERIVATIVES
AND
INTEGRALS


In 1604, at the height of

his scientific career, Galileo argued 
that for a rectilinear motion 

in which speed increases proportionally 
to distance covered, 

the law of motion should be 
just that (x = ct²)

which he had discovered 

in the investigation of falling bodies. 
Between 1695 and 1700 

not a single one of the monthly issues 
of Leipzig's Acta Eruditorum was published
without articles of Leibniz,

the Bernoulli brothers

or the Marquis de l'Hôpital treating,
with notation only slightly different from 
that which we use today, 

the most varied problems of 
differential calculus, integral calculus 
and the calculus of variations. 

Thus in the space of almost precisely 
one century 

infinitesimal calculus or, 

as we now call it in English, 

The Calculus, 

the calculating tool par excellence, 

had been forged;

and nearly three centuries of 

constant use have not completely dulled 
this incomparable instrument. 


NICHOLAS BOURBAKI 


FIGURE 2 


CHAPTER 9


DERIVATIVES


The derivative of a function is the first of the two major concepts of this section. 
Together with the integral, it constitutes the source from which calculus 
derives its particular flavor. While it is true that the concept of a function is 
fundamental, that you cannot do anything without limits or continuity, and 
that least upper bounds are essential, everything we have done until now has 
been preparation—if adequate, this section will be easier than the preceding 
ones—for the really exciting ideas to come, the powerful concepts that are 
truly characteristic of calculus. 

Perhaps (some would say “‘certainly’’) the interest of the ideas to be intro- 
duced in this section stems from the intimate connection between the mathe- 
matical concepts and certain physical ideas. Many definitions, and even some 
theorems, may be described in terms of physical problems, often in a revealing 
way. In fact, the demands of physics were the original inspiration for these 
fundamental ideas of calculus, and we shall frequently mention the physical 
interpretations. But we shall always first define the ideas in precise mathe- 
matical form, and discuss their significance in terms of mathematical problems. 

The collection of all functions exhibits such diversity that there is almost no 
hope of discovering any interesting general properties pertaining to all. 
Because continuous functions form such a restricted class, we might expect to 
find some nontrivial theorems pertaining to them, and the sudden abundance 
of theorems after Chapter 6 shows that this expectation is justified. But the
most interesting and most powerful results about functions will be obtained 
only when we restrict our attention even further, to functions which have even 
greater claim to be called ‘“‘reasonable,”’ which are even better behaved than 
most continuous functions. 

FIGURE 1 (panels (a), (b), (c)): the graphs of f(x) = |x|; of f(x) = x² for
x ≤ 0 and f(x) = x for x ≥ 0; and of f(x) = √|x|
Figure 1 illustrates certain types of misbehavior which continuous functions
can display. The graphs of these functions are “bent” at (0, 0), unlike the
graph of Figure 2, where it is possible to draw a “tangent line” at each point.
The quotation marks have been used to avoid the suggestion that we have
defined “bent” or “tangent line,” although we are suggesting that the graph
might be “bent” at a point where a “tangent line” cannot be drawn. You
have probably already noticed that a tangent line cannot be defined as a line
which intersects the graph only once; such a definition would be both too
restrictive and too permissive. With such a definition, the straight line shown
in Figure 3 would not be a tangent line to the graph in that picture, while the
parabola would have two tangent lines at each point (Figure 4), and the three
functions in Figure 5 would have more than one tangent line at the points
where they are “bent.”


FIGURE 3


FIGURE 4


(a) (b) (c) 


FIGURE 5 


A more promising approach to the definition of a tangent line might start
with “secant lines,” and use the notion of limits. If h ≠ 0, then the two distinct
points (a, f(a)) and (a + h, f(a + h)) determine, as in Figure 6, a straight line
whose slope is

\[ \frac{f(a + h) - f(a)}{h}. \]


FIGURE 6: the secant line through (a, f(a)) and (a + h, f(a + h)), with
horizontal run h and vertical rise f(a + h) − f(a)


As Figure 7 illustrates, the “tangent line” at (a, f(a)) seems to be the limit, in
some sense, of these “secant lines,” as h approaches 0. We have never before
talked about a “limit” of lines, but we can talk about the limit of their slopes:
the slope of the tangent line through (a, f(a)) should be

\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}. \]


FIGURE 7


We are ready for a definition, and some comments.


DEFINITION

The function f is differentiable at a if

\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \quad \text{exists.} \]

In this case the limit is denoted by f′(a) and is called the derivative of f at
a. (We also say that f is differentiable if f is differentiable at a for every a
in the domain of f.)


The first comment on our definition is really an addendum; we define the
tangent line to the graph of f at (a, f(a)) to be the line through (a, f(a)) with
slope f′(a). This means that the tangent line at (a, f(a)) is defined only if f is
differentiable at a.

The second comment refers to notation. The symbol f′(a) is certainly
reminiscent of functional notation. In fact, for any function f, we denote by f′
the function whose domain is the set of all numbers a such that f is
differentiable at a, and whose value at such a number a is

\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}. \]

(To be very precise: f′ is the collection of all pairs

\[ \left( a, \; \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \right) \]

for which lim_{h→0} [f(a + h) − f(a)]/h exists.) The function f′ is called the
derivative of f.

Our third comment, somewhat longer than the previous two, refers to the
physical interpretation of the derivative. Consider a particle which is moving
along a straight line (Figure 8(a)) on which we have chosen an “origin” point O,
and a direction in which distances from O shall be written as positive numbers,
the distance from O of points in the other direction being written as negative
numbers. Let s(t) denote the distance of the particle from O, at time t. The
suggestive notation s(t) has been chosen purposely; since a distance s(t) is
determined for each number t, the physical situation automatically supplies us
with a certain function s. The graph of s indicates the distance of the particle
from O, on the vertical axis, in terms of the time, indicated on the horizontal
axis (Figure 8(b)).


FIGURE 8(a): the line along which the particle is moving, with the origin O
and the positions of the particle at t = 0 and t = 1

The quotient

\[ \frac{s(a + h) - s(a)}{h} \]

has a natural physical interpretation. It is the “average velocity” of the
particle during the time interval from a to a + h. For any particular a, this
average speed depends on h, of course. On the other hand, the limit

\[ \lim_{h \to 0} \frac{s(a + h) - s(a)}{h} \]

depends only on a (as well as the particular function s) and there are important
physical reasons for considering this limit. We would like to speak of the
“velocity of the particle at time a,” but the usual definition of velocity is
really a definition of average velocity; the only reasonable definition of
“velocity at time a” (so-called “instantaneous velocity”) is the limit

\[ \lim_{h \to 0} \frac{s(a + h) - s(a)}{h}. \]

Thus we define the (instantaneous) velocity of the particle at a to be s′(a).
Notice that s′(a) could easily be negative; the absolute value |s′(a)| is
sometimes called the (instantaneous) speed.


FIGURE 8(b): the graph of s, with “distance” on the vertical axis and time on
the horizontal axis

It is important to realize that instantaneous velocity is a theoretical concept,
an abstraction which does not correspond precisely to any observable
quantity. While it would not be fair to say that instantaneous velocity has nothing
to do with average velocity, remember that s′(t) is not

\[ \frac{s(t + h) - s(t)}{h} \]

for any particular h, but merely the limit of these average velocities as h
approaches 0. Thus, when velocities are measured in physics, what a physicist
really measures is an average velocity over some (very small) time interval;
such a procedure cannot be expected to give an exact answer, but this is
really no defect, because physical measurements can never be exact anyway.
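
The distinction between average and instantaneous velocity can be seen numerically. The following Python sketch is an illustration only: the position function s(t) = t² and the time a = 3 are hypothetical choices made for this example, and the helper name `average_velocity` is invented here. For this particular s the average velocity over the interval from a to a + h works out to 2a + h, so the printed values should approach 6 as h shrinks, without ever being a single "measured" instantaneous velocity.

```python
def average_velocity(s, a, h):
    """Average velocity of a particle with position function s
    over the time interval from a to a + h (h must be nonzero)."""
    if h == 0:
        raise ValueError("h must be nonzero")
    return (s(a + h) - s(a)) / h

# Hypothetical position function, chosen only for illustration.
def s(t):
    return t * t

a = 3.0
for h in (1.0, 0.1, 0.01, 0.001):
    # Each line is an average velocity over a shorter time interval.
    print(h, average_velocity(s, a, h))
```

Because the computation uses floating-point arithmetic, the printed values agree with 2a + h only up to rounding, which mirrors the remark that physical measurements are never exact.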

The velocity of a particle is often called the “rate of change of its position.” 
This notion of the derivative, as a rate of change, applies to any other physical 
situation in which some quantity varies with time. For example, the “rate of 
change of mass” of a growing object means the derivative of the function m, 
where m(t) is the mass at time t.

In order to become familiar with the basic definitions of this chapter, we
will spend quite some time examining the derivatives of particular functions.
Before proving the important theoretical results of Chapter 11, we want to
have a good idea of what the derivative of a function looks like. The next
chapter is devoted exclusively to one aspect of this problem: calculating the
derivative of complicated functions. In this chapter we will emphasize the
concepts, rather than the calculations, by considering a few simple examples.
Simplest of all is a constant function, f(x) = c. In this case

\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{h \to 0} \frac{c - c}{h} = 0. \]

Thus f is differentiable at a for every number a, and f′(a) = 0. This means
that the tangent line to the graph of f always has slope 0, so the tangent line
always coincides with the graph.


FIGURE 9
Constant functions are not the only ones whose graphs coincide with their
tangent lines; this happens for any linear function f(x) = cx + d. Indeed

\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
      &= \lim_{h \to 0} \frac{c(a + h) + d - [ca + d]}{h} \\
      &= \lim_{h \to 0} \frac{ch}{h} = c;
\end{aligned}
\]

the slope of the tangent line is c, the same as the slope of the graph of f.
A refreshing difference occurs for f(x) = x². Here

\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
      &= \lim_{h \to 0} \frac{(a + h)^2 - a^2}{h} \\
      &= \lim_{h \to 0} \frac{a^2 + 2ah + h^2 - a^2}{h} \\
      &= \lim_{h \to 0} (2a + h) \\
      &= 2a.
\end{aligned}
\]

Some of the tangent lines to the graph of f are shown in Figure 9. In this
picture each tangent line appears to intersect the graph only once, and this fact
can be checked fairly easily: Since the tangent line through (a, a²) has slope 2a,
it is the graph of the function

\[ g(x) = 2a(x - a) + a^2 = 2ax - a^2. \]

Now, if the graphs of f and g intersect at a point (x, f(x)) = (x, g(x)), then

\[
\begin{aligned}
x^2 &= 2ax - a^2 \\
\text{or} \quad x^2 - 2ax + a^2 &= 0; \\
\text{so} \quad (x - a)^2 &= 0 \\
\text{or} \quad x &= a.
\end{aligned}
\]

In other words, (a, a²) is the only point of intersection.


FIGURE 10


The function f(x) = x² happens to be quite special in this regard; usually
a tangent line will intersect the graph more than once. Consider, for example,
the function f(x) = x³. In this case

\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
      &= \lim_{h \to 0} \frac{(a + h)^3 - a^3}{h} \\
      &= \lim_{h \to 0} \frac{a^3 + 3a^2 h + 3a h^2 + h^3 - a^3}{h} \\
      &= \lim_{h \to 0} \frac{3a^2 h + 3a h^2 + h^3}{h} \\
      &= \lim_{h \to 0} \left( 3a^2 + 3ah + h^2 \right) \\
      &= 3a^2.
\end{aligned}
\]

Thus the tangent line to the graph of f at (a, a³) has slope 3a². This means
that the tangent line is the graph of

\[ g(x) = 3a^2 (x - a) + a^3 = 3a^2 x - 2a^3. \]

The graphs of f and g intersect at the point (x, f(x)) = (x, g(x)) when

\[
\begin{aligned}
x^3 &= 3a^2 x - 2a^3 \\
\text{or} \quad x^3 - 3a^2 x + 2a^3 &= 0.
\end{aligned}
\]

This equation is easily solved if we remember that one solution of the equation
has got to be x = a, so that (x − a) is a factor of the left side; the other factor
can then be found by dividing. We obtain

\[ (x - a)(x^2 + ax - 2a^2) = 0. \]

It so happens that x² + ax − 2a² also has x − a as a factor; we obtain finally

\[ (x - a)(x - a)(x + 2a) = 0. \]

Thus, as illustrated in Figure 10, the tangent line through (a, a³) also
intersects the graph at the point (−2a, −8a³). These two points are always
distinct, except when a = 0.
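
The second intersection point can be checked with exact arithmetic. The following Python sketch is an illustration only: the helper name `tangent_to_cube` and the sample values of a are choices made for this example. It builds the tangent line to f(x) = x³ at (a, a³) from the slope 3a² found above and confirms that the line meets the graph again at x = −2a.

```python
from fractions import Fraction

def tangent_to_cube(a):
    """Return the function g with g(x) = 3a^2 (x - a) + a^3,
    whose graph is the tangent line to f(x) = x^3 at (a, a^3)."""
    return lambda x: 3 * a**2 * (x - a) + a**3

for a in (Fraction(1), Fraction(-2), Fraction(3, 4)):
    g = tangent_to_cube(a)
    x = -2 * a
    # At x = -2a the cube and the tangent line take the same value, -8a^3.
    assert x**3 == g(x) == -8 * a**3
    # The tangent line also passes through the point of tangency (a, a^3).
    assert g(a) == a**3
```

Exact fractions are used so that the equalities hold exactly rather than up to rounding.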

We have already found the derivative of sufficiently many functions to
illustrate the classical, and still very popular, notation for derivatives. For a
given function f, the derivative f′ is often denoted by

\[ \frac{df(x)}{dx}. \]

For example, the symbol

\[ \frac{dx^2}{dx} \]

denotes the derivative of the function f(x) = x². Needless to say, the separate
parts of the expression

\[ \frac{df(x)}{dx} \]

are not supposed to have any sort of independent existence: the d's are not
numbers, they cannot be canceled, and the entire expression is not the quotient
of two other numbers “df(x)” and “dx.” This notation is due to Leibniz
(generally considered an independent co-discoverer of calculus, along with
Newton), and is affectionately referred to as Leibnizian notation.* Although
the notation df(x)/dx seems very complicated, in concrete cases it may be
shorter; after all, the symbol dx²/dx is actually more concise than the phrase
“the derivative of the function f(x) = x².”

The following formulas state in standard Leibnizian notation all the
information that we have found so far:

\[
\begin{aligned}
\frac{dc}{dx} &= 0, \\
\frac{d(ax + b)}{dx} &= a, \\
\frac{dx^2}{dx} &= 2x, \\
\frac{dx^3}{dx} &= 3x^2.
\end{aligned}
\]


Although the meaning of these formulas is clear enough, attempts at literal
interpretation are hindered by the reasonable stricture that an equation should
not contain a function on one side and a number on the other. For example, if
the third equation is to be true, then either df(x)/dx must denote f′(x), rather
than f′, or else 2x must denote, not a number, but the function whose value at
x is 2x. It is really impossible to assert that one or the other of these alternatives
is intended; in practice df(x)/dx sometimes means f′ and sometimes means
f′(x), while 2x may denote either a number or a function. Because of this
ambiguity, most authors are reluctant to denote f′(a) by

\[ \frac{df(x)}{dx}(a); \]

instead f′(a) is usually denoted by the barbaric, but unambiguous, symbol

\[ \left. \frac{df(x)}{dx} \right|_{x = a}. \]


* Leibniz was led to this symbol by his intuitive notion of the derivative, which he considered
to be, not the limit of quotients [f(x + h) − f(x)]/h, but the “value” of this quotient when h
is an “infinitely small” number. This “infinitely small” quantity was denoted by dx and the
corresponding “infinitely small” difference f(x + dx) − f(x) by df(x). Although this point
of view is impossible to reconcile with properties (P1)-(P13) of the real numbers, some people
find this notion of the derivative congenial.


In addition to these difficulties, Leibnizian notation is associated with one
more ambiguity. Although the notation dx²/dx is absolutely standard, the
notation df(x)/dx is often replaced by df/dx. This, of course, is in conformity
with the practice of confusing a function with its value at x. So strong is this
tendency that functions are often indicated by a phrase like the following:
“consider the function y = x².” We will sometimes follow classical practice to
the extent of using y as the name of a function, but we will nevertheless
carefully distinguish between the function and its values; thus we will always say
something like “consider the function (defined by) y(x) = x².”

Despite the many ambiguities of Leibnizian notation, it is used almost 
exclusively in older mathematical writing, and is still used very frequently 
today. The staunchest opponents of Leibnizian notation admit that it will be 
around for quite some time, while its most ardent admirers would say that it 
will be around forever, and a good thing too! In any case, Leibnizian notation 
cannot be ignored completely. 

The policy adopted in this book is to disallow Leibnizian notation within 
the text, but to include it in the Problems: several chapters contain a few 
(immediately recognizable) problems which are expressly designed to illus- 
trate the vagaries of Leibnizian notation. Trusting that these problems will 
provide ample practice in this notation, we return to our basic task of examin- 
ing some simple examples of derivatives. 

The few functions examined so far have all been differentiable. To fully 
appreciate the significance of the derivative it is equally important to know 
some examples of functions which are not differentiable. The obvious candi- 
dates are the three functions first discussed in this chapter, and illustrated in 
Figure 1; if they turn out to be differentiable at 0 something has clearly gone 
wrong. 

Consider first f(x) = |x|. In this case

\[ \frac{f(0 + h) - f(0)}{h} = \frac{|h|}{h}. \]

Now |h|/h = 1 for h > 0, and |h|/h = −1 for h < 0. This shows that

\[ \lim_{h \to 0} \frac{f(h) - f(0)}{h} \]

does not exist. In fact,

\[ \lim_{h \to 0^+} \frac{f(h) - f(0)}{h} = 1 \]

and

\[ \lim_{h \to 0^-} \frac{f(h) - f(0)}{h} = -1. \]


(These two limits are sometimes called the right-hand derivative and the 
left-hand derivative, respectively, of f at 0.) 
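
The two one-sided limits can be watched numerically. The following Python
sketch is an illustration only and not part of the original text; the helper
name `difference_quotient` is an assumption made for this example. It
evaluates the quotient for a few small positive and negative values of h.

```python
def difference_quotient(f, a, h):
    """Return (f(a + h) - f(a)) / h, the slope of the secant line through (a, f(a))."""
    return (f(a + h) - f(a)) / h


# For f(x) = |x| at a = 0, approach 0 from the right (h > 0) and the left (h < 0).
for h in (0.1, 0.001, 0.00001):
    right = difference_quotient(abs, 0.0, h)
    left = difference_quotient(abs, 0.0, -h)
    # The right-hand quotient is |h|/h = 1 and the left-hand quotient is -1.
    print(h, right, left)
```

Since the two columns never approach a common value, no single number can
serve as f'(0).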




If a ≠ 0, then f'(a) does exist. In fact,

\[ f'(x) = \begin{cases} 1, & x > 0 \\ -1, & x < 0. \end{cases} \]


The proof of this fact is left to you (it is easy if you remember the derivative of 
a linear function). The graphs of f and of f’ are shown in Figure 11. 
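For the function

\[ f(x) = \begin{cases} x^2, & x \le 0 \\ x, & x \ge 0 \end{cases} \]

a similar difficulty arises in connection with f'(0). We have

\[ \frac{f(h) - f(0)}{h} = \begin{cases} \dfrac{h^2}{h} = h, & h < 0 \\[1ex] \dfrac{h}{h} = 1, & h > 0. \end{cases} \]

FIGURE 11

Therefore,

\[ \lim_{h \to 0^-} \frac{f(h) - f(0)}{h} = 0, \qquad \text{but} \qquad \lim_{h \to 0^+} \frac{f(h) - f(0)}{h} = 1. \]

Thus f'(0) does not exist; f is not differentiable at 0. Once again, however,
f'(x) exists for x ≠ 0; it is easy to see that

\[ f'(x) = \begin{cases} 2x, & x < 0 \\ 1, & x > 0. \end{cases} \]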


The graphs of f and f’ are shown in Figure 12. 
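Even worse things happen for f(x) = √|x|. For this function

\[ \frac{f(h) - f(0)}{h} = \begin{cases} \dfrac{\sqrt{h}}{h} = \dfrac{1}{\sqrt{h}}, & h > 0 \\[1ex] \dfrac{\sqrt{-h}}{h} = -\dfrac{1}{\sqrt{-h}}, & h < 0. \end{cases} \]

FIGURE 12

In this case the right-hand limit

\[ \lim_{h \to 0^+} \frac{f(h) - f(0)}{h} = \lim_{h \to 0^+} \frac{1}{\sqrt{h}} \]

does not exist; instead 1/√h becomes arbitrarily large as h approaches 0.
And, what's more, −1/√(−h) becomes arbitrarily large in absolute value, but
negative (Figure 13).

FIGURE 13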




FIGURE 14 


The function f(x) = ∛x, although not differentiable at 0, is at least a little
better behaved than this. The quotient

\[ \frac{f(h) - f(0)}{h} = \frac{\sqrt[3]{h}}{h} = \frac{h^{1/3}}{h} = \frac{1}{h^{2/3}} = \frac{1}{(\sqrt[3]{h})^2} \]

simply becomes arbitrarily large as h goes to 0. Sometimes one says that f has
an "infinite" derivative at 0. Geometrically this means that the graph of f has
a "tangent line" which is parallel to the vertical axis (Figure 14). Of course,
f(x) = −∛x has the same geometric property, but one would say that f has
a derivative of "negative infinity" at 0.

Remember that differentiability is supposed to be an improvement over
mere continuity. This idea is supported by the many examples of functions
which are continuous, but not differentiable; however, one important point
remains to be noted:

THEOREM 1   If f is differentiable at a, then f is continuous at a.

PROOF

\[
\begin{aligned}
\lim_{h \to 0} f(a + h) - f(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \cdot h \\
  &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \cdot \lim_{h \to 0} h \\
  &= f'(a) \cdot 0 \\
  &= 0.
\end{aligned}
\]

As we pointed out in Chapter 5, the equation \lim_{h \to 0} f(a + h) - f(a) = 0 is
equivalent to \lim_{x \to a} f(x) = f(a); thus f is continuous at a. ∎
τα 

It is very important to remember Theorem 1, and just as important to 
remember that the converse is not true. A differentiable function is continuous, 
but a continuous function need not be differentiable (keep in mind the func- 
tion f(x) = |x|, and you will never forget which statement is true and which 
false). 

The continuous functions examined so far have been differentiable at all 
points with at most one exception, but it is easy to give examples of continuous 
functions which are not differentiable at several points, even an infinite number 
(Figure 15). Actually, one can do much worse than this. There is a function 


FIGURE 15 




FIGURE 16 


which is continuous everywhere and differentiable nowhere! Unfortunately, the defini- 
tion of this function will be inaccessible to us until Chapter 23, and I have been 
unable to persuade the artist to draw it (consider carefully what the graph 
should look like and you will sympathize with his point of view). It is possible 
to draw some rough approximations to the graph, however; several succes- 
sively better approximations are shown in Figure 16. 

Although such spectacular examples of nondifferentiability must be post- 
poned, we can, with a little ingenuity, find a continuous function which is not 
differentiable at infinitely many points, all of which are in [0, 1]. One such func- 
tion is illustrated in Figure 17. The reader is given the problem of defining it 
precisely; it is a straight line version of the function

\[ f(x) = \begin{cases} x \sin \dfrac{1}{x}, & x \ne 0 \\ 0, & x = 0. \end{cases} \]

FIGURE 17

This particular function f is itself quite sensitive to the question of
differentiability. Indeed, for h ≠ 0 we have

\[ \frac{f(h) - f(0)}{h} = \frac{h \sin \dfrac{1}{h} - 0}{h} = \sin \frac{1}{h}. \]

Long ago we proved that \lim_{h \to 0} \sin 1/h does not exist, so f is not differentiable
at 0. Geometrically, one can see that a tangent line cannot exist, by noting
that the secant line through (0, 0) and (h, f(h)) in Figure 18 can have any slope
between −1 and 1, no matter how small we require h to be.
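
The remark about secant slopes can be illustrated with a short Python sketch
(an illustration only, not from the original text; the names `f` and
`quotient` and the particular sample points are assumptions of this example).
It picks arbitrarily small h at which sin(1/h) is very close to +1, and
others at which it is very close to −1.

```python
import math


def f(x):
    """f(x) = x sin(1/x) for x != 0, and f(0) = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def quotient(h):
    """Slope of the secant line through (0, 0) and (h, f(h))."""
    return (f(h) - f(0)) / h


for n in (10, 1000, 100000):
    h_up = 1.0 / ((2 * n + 0.5) * math.pi)    # 1/h is near pi/2 plus a multiple of 2*pi
    h_down = 1.0 / ((2 * n + 1.5) * math.pi)  # 1/h is near 3*pi/2 plus a multiple of 2*pi
    # Both h values shrink with n, yet the slopes stay near +1 and -1.
    print(h_up, quotient(h_up), h_down, quotient(h_down))
```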


FIGURE 18 


This finding represents something of a triumph; although continuous, the 
function f seems somehow quite unreasonable, and we can now enunciate one 
mathematically undesirable feature of this function—it is not differentiable 
at 0. Nevertheless, one should not become too enthusiastic about the criterion 
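of differentiability. For example, the function

\[ g(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases} \]

is differentiable at 0; in fact g'(0) = 0:

\[
\begin{aligned}
\lim_{h \to 0} \frac{g(h) - g(0)}{h} &= \lim_{h \to 0} \frac{h^2 \sin \dfrac{1}{h}}{h} \\
  &= \lim_{h \to 0} h \sin \frac{1}{h} \\
  &= 0.
\end{aligned}
\]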


The tangent line to the graph of g at (0, 0) is therefore the horizontal axis 
(Figure 19). 
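
By way of contrast with the previous function, the quotient for g is
h sin(1/h), which is squeezed between −|h| and |h|. The Python sketch below
is an illustration under that observation, not part of the original text; the
names `g` and `quotient` are assumptions of this example.

```python
import math


def g(x):
    """g(x) = x^2 sin(1/x) for x != 0, and g(0) = 0."""
    return x**2 * math.sin(1.0 / x) if x != 0 else 0.0


def quotient(h):
    """(g(h) - g(0)) / h, which equals h * sin(1/h) for h != 0."""
    return (g(h) - g(0)) / h


for h in (0.1, -0.01, 0.001, -0.0001):
    # Each quotient is no larger than |h| in absolute value, so it is forced toward 0.
    print(h, quotient(h))
```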

This example suggests that we should seek even more restrictive conditions 
on a function than mere differentiability. We can actually use the derivative to 


FIGURE 19


formulate such conditions if we introduce another set of definitions, the last of 
this chapter. 

For any function f, we obtain, by taking the derivative, a new function f'
(whose domain may be considerably smaller than that of f). The notion of
differentiability can be applied to the function f', of course, yielding another
function (f')', whose domain consists of all points a such that f' is differentiable
at a. The function (f')' is usually written simply f'' and is called the second
derivative of f. If f''(a) exists, then f is said to be 2-times differentiable at
a, and the number f''(a) is called the second derivative of f at a.

In physics the second derivative is particularly important. If s(t) is the
position at time t of a particle moving along a straight line, then s''(t) is called
the acceleration at time t. Acceleration plays a special role in physics, because,
as stated in Newton's laws of motion, the force on a particle is the product of its
mass and its acceleration. Consequently you can feel the second derivative
when you sit in an accelerating car.
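
As a rough numerical illustration (not from the original text), the Python
sketch below estimates s''(t) for the falling-body position s(t) = (a/2)t²
with a = 32, the value quoted in Problem 11 of this chapter. The estimate uses
a second central difference, which is a standard approximation and not the
definition of the second derivative; the names `s` and `second_difference`
are assumptions of this example.

```python
def s(t, a=32.0):
    """Position of a falling body, s(t) = (a/2) t^2, with a = 32 (feet per second squared)."""
    return (a / 2.0) * t**2


def second_difference(f, t, h=1e-3):
    """Approximate f''(t) by (f(t + h) - 2 f(t) + f(t - h)) / h^2 (an approximation only)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


for t in (0.5, 2.0, 5.0):
    # The estimated acceleration is close to 32 at every time t.
    print(t, second_difference(s, t))
```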

There is no reason to stop at the second derivative; we can define f''' =
(f'')', f'''' = (f''')', etc. This notation rapidly becomes unwieldy, so the
following abbreviation is usually adopted (it is really a recursive definition):

\[ f^{(1)} = f', \qquad f^{(k+1)} = (f^{(k)})'. \]

Thus

\[ f^{(1)} = f', \quad f^{(2)} = f'', \quad f^{(3)} = f''', \quad f^{(4)} = f'''', \]

etc.

The various functions f^{(k)}, for k ≥ 2, are sometimes called higher-order
derivatives of f.

Usually, we resort to the notation f^{(k)} only for k ≥ 4, but it is convenient to
have f^{(k)} defined for smaller k also. In fact, a reasonable definition can be made
for f^{(0)}, namely,

\[ f^{(0)} = f. \]
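
The recursive definition translates almost word for word into code. The Python
sketch below is an illustration only: it represents a polynomial by its list
of coefficients and assumes the usual term-by-term rule for differentiating
polynomials, which the text establishes only in the next chapter. The names
`derivative` and `kth_derivative` are assumptions of this example.

```python
def derivative(coeffs):
    """coeffs[i] is the coefficient of x**i; return the coefficient list of the derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def kth_derivative(coeffs, k):
    """Mirror the recursion f^(0) = f and f^(k+1) = (f^(k))'."""
    if k == 0:
        return list(coeffs)
    return derivative(kth_derivative(coeffs, k - 1))


square = [0, 0, 1]  # f(x) = x**2
for k in range(4):
    # An empty list stands for the zero polynomial.
    print(k, kth_derivative(square, k))
```

For f(x) = x² this prints the coefficient lists of x², 2x, 2, and then the
zero polynomial.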


Leibnizian notation for higher-order derivatives should also be mentioned. 


FIGURE 20


The natural Leibnizian symbol for f''(x), namely,

\[ \frac{d\left(\dfrac{df(x)}{dx}\right)}{dx}, \]

is abbreviated to

\[ \frac{d^2 f(x)}{(dx)^2}, \quad \text{or more frequently to} \quad \frac{d^2 f(x)}{dx^2}. \]

Similar notation is used for f^{(k)}(x).

The following example illustrates the notation f^{(k)}, and also shows, in one
very simple case, how various higher-order derivatives are related to the
original function. Let f(x) = x². Then, as we have already checked,

\[
\begin{aligned}
f'(x) &= 2x, \\
f''(x) &= 2, \\
f'''(x) &= 0, \\
f^{(k)}(x) &= 0, \quad \text{if } k \ge 3.
\end{aligned}
\]

Figure 20 shows the function f, together with its various derivatives.
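A rather more illuminating example is presented by the following function,
whose graph is shown in Figure 21(a):

\[ f(x) = \begin{cases} x^2, & x \ge 0 \\ -x^2, & x \le 0. \end{cases} \]

It is easy to see that

\[ f'(a) = 2a \text{ if } a > 0, \qquad f'(a) = -2a \text{ if } a < 0. \]

Moreover,

\[ f'(0) = \lim_{h \to 0} \frac{f(h) - f(0)}{h} = \lim_{h \to 0} \frac{f(h)}{h}. \]

Now

\[ \lim_{h \to 0^+} \frac{f(h)}{h} = \lim_{h \to 0^+} \frac{h^2}{h} = 0 \]

and

\[ \lim_{h \to 0^-} \frac{f(h)}{h} = \lim_{h \to 0^-} \frac{-h^2}{h} = 0, \]

so

\[ f'(0) = \lim_{h \to 0} \frac{f(h)}{h} = 0. \]

This information can all be summarized as follows:

\[ f'(x) = 2|x|. \]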


FIGURE 21 (graphs of f, of f'(x) = 2|x|, and of f''(x) = 2 for x > 0, f''(x) = −2 for x < 0)

It follows that f''(0) does not exist! Existence of the second derivative is thus a
rather strong criterion for a function to satisfy. Even a "smooth looking"
function like f reveals some irregularity when examined with the second derivative.
This suggests that the irregular behavior of the function

\[ g(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases} \]

might also be revealed by the second derivative. At the moment we know that
g'(0) = 0, but we do not know g'(a) for any a ≠ 0, so it is hopeless to begin
computing g’’(0). We will return to this question at the end of the next chapter, 
after we have perfected the technique of finding derivatives. 


PROBLEMS 


1. (a) Prove, working directly from the definition, that if f(x) = 1/x, then
f'(a) = −1/a², for a ≠ 0.
(b) Prove that the tangent line to the graph of f at (a, 1/a) does not
intersect the graph of f, except at (a, 1/a).

2. (a) Prove that if f(x) = 1/x², then f'(a) = −2/a³ for a ≠ 0.
(b) Prove that the tangent line to f at (a, 1/a²) intersects f at one other
point, which lies on the opposite side of the vertical axis.

3. Prove that if f(x) = √x, then f'(a) = 1/(2√a), for a > 0. (The expression
you obtain for [f(a + h) − f(a)]/h will require some algebraic face
lifting, but the answer should suggest the right trick.)

4. For each natural number n, let S_n(x) = x^n. Remembering that
S_1'(x) = 1, S_2'(x) = 2x, and S_3'(x) = 3x², conjecture a formula for
S_n'(x). Prove your conjecture. (The expression (x + h)^n may be expanded
by the binomial theorem.)

5. Find f' if f(x) = [x].

6. Prove, starting from the definition (and drawing a picture to illustrate):
(a) if g(x) = f(x) + c, then g'(x) = f'(x);
(b) if g(x) = cf(x), then g'(x) = cf'(x).

7. Suppose that f(x) = x³.
(a) What is f'(9), f'(25), f'(36)?
(b) What is f'(3²), f'(5²), f'(6²)?
(c) What is f'(a²), f'(x²)?
If you do not find this problem silly, you are missing a very important
point: f'(x²) means the derivative of f at the number which we happen
to be calling x²; it is not the derivative at x of the function g(x) = f(x²).
Just to drive the point home:
(d) For f(x) = x³, compare f'(x²) and g'(x) where g(x) = f(x²).




8. (a) Suppose g(x) = f(x + c). Prove (starting from the definition) that
g'(x) = f'(x + c). Draw a picture to illustrate this. To do this problem
you must write out the definitions of g'(x) and f'(x + c) correctly.
The purpose of Problem 7 was to convince you that although
this problem is easy, it is not an utter triviality, and there is
something to prove: you cannot simply put prime marks into the equation
g(x) = f(x + c). To emphasize this point:
(b) Prove that if g(x) = f(cx), then g'(x) = c · f'(cx). Try to see
pictorially why this should be true, also.
(c) Suppose that f is differentiable and periodic, with period a (i.e.,
f(x + a) = f(x) for all x). Prove that f' is also periodic.

9. Find f'(x) and also f'(x + 3) in the following cases. Be very methodical,
or you will surely slip up somewhere. Consult the answers (after you do
the problem, naturally).
(i) f(x) = (x + 3)^5.
(ii) f(x + 3) = x^5.
(iii) f(x + 3) = (x + 5)^7.

10. Find f'(x) if f(x) = g(t + x), and if f(t) = g(t + x). The answers will
not be the same.

11. (a) Prove that Galileo was wrong: if a body falls a distance s(t) in t
seconds, and s' is proportional to s, then s cannot be a function of the
form s(t) = ct².
(b) Prove that the following facts are true about s if s(t) = (a/2)t² (the
first fact will show why we switched from c to a/2):
(i) s''(t) = a (the acceleration is constant).
(ii) [s'(t)]² = 2as(t).
(c) If s is measured in feet, the value of a is 32. How many seconds do
you have to get out of the way of a chandelier which falls from a
400-foot ceiling? If you don't make it, how fast will the chandelier be
going when it hits you? Where was the chandelier when it was moving
with half that speed?

12. Imagine a road on which the speed limit is specified at every single point.
In other words, there is a certain function L such that the speed limit x
miles from the beginning of the road is L(x). Two cars, A and B, are
driving along this road; car A's position at time t is a(t), and car B's is
b(t).
(a) What equation expresses the fact that car A always travels at the
speed limit? (The answer is not a'(t) = L(t).)
(b) Suppose that A always goes at the speed limit, and that B's position
at time t is A's position at time t − 1. Show that B is also going at
the speed limit at all times.
(c) Suppose B always stays a constant distance behind A. Under what
conditions will B still always travel at the speed limit?


13. Suppose that f(a) = g(a) and that the left-hand derivative of f at a equals
the right-hand derivative of g at a. Define h(x) = f(x) for x ≤ a, and
h(x) = g(x) for x ≥ a. Prove that h is differentiable at a.

14. Let f(x) = x² if x is rational, and f(x) = 0 if x is irrational. Prove that f is
differentiable at 0. (Don't be scared by this function. Just write out the
definition of f'(0).)

*15. (a) Let f be a function such that |f(x)| ≤ x² for all x. Prove that f is
differentiable at 0. (If you have done Problem 14 you should be able
to do this.)
(b) This result can be generalized if x² is replaced by |g(x)|, where g has
what property?

16. Let α > 1. If f satisfies |f(x)| ≤ |x|^α, prove that f is differentiable at 0.

17. Let 0 < β < 1. Prove that if f satisfies |f(x)| ≥ |x|^β and f(0) = 0, then
f is not differentiable at 0.

*18. Let f(x) = 0 for irrational x, and 1/q for x = p/q in lowest terms. Prove
that f is not differentiable at a for any a. Hint: It obviously suffices to
prove this for irrational a. Why? If a = m.a_1a_2a_3 . . . is the decimal
expansion of a, consider [f(a + h) − f(a)]/h for h rational, and also for
h = −0.00 . . . 0a_{n+1}a_{n+2} . . . .

19. (a) Suppose that f(a) = g(a) = h(a), that f(x) ≤ g(x) ≤ h(x) for all x,
and that f'(a) = h'(a). Prove that g is differentiable at a, and that
f'(a) = g'(a) = h'(a). (Begin with the definition of g'(a).)
(b) Show that the conclusion does not follow if we omit the hypothesis
f(a) = g(a) = h(a).

20. Let f be any polynomial function; we will see in the next chapter that
f is differentiable. The tangent line to f at (a, f(a)) is the graph of g(x) =
f'(a)(x − a) + f(a). Thus f(x) − g(x) is the polynomial function d(x) =
f(x) − f'(a)(x − a) − f(a). We have already seen that if f(x) = x², then
d(x) = (x − a)², and if f(x) = x³, then d(x) = (x − a)²(x + 2a).
(a) Find d(x) when f(x) = x⁴, and show that it is divisible by (x − a)².
(b) There certainly seems to be some evidence that d(x) is always
divisible by (x − a)². Figure 22 provides an intuitive argument:
usually, lines parallel to the tangent line will intersect the graph at
two points; the tangent line intersects the graph only once near the
point, so the intersection should be a "double intersection." To give
a rigorous proof, first note that

\[ \frac{d(x)}{x - a} = \frac{f(x) - f(a)}{x - a} - f'(a). \]

Now answer the following questions. Why is f(x) − f(a) divisible by
(x − a)? Why is there a polynomial function h such that h(x) =
d(x)/(x − a) for x ≠ a? Why is \lim_{x \to a} h(x) = 0? Why is h(a) = 0?
Why does this solve the problem?

FIGURE 22




21. (a) Show that f'(a) = \lim_{x \to a} [f(x) − f(a)]/(x − a). (Nothing deep here.)
(b) Show that derivatives are a "local property": if f(x) = g(x) for all x
in some open interval containing a, then f'(a) = g'(a). (This means
that in computing f'(a), you can ignore f(x) for any particular x ≠ a.
Of course you can't ignore f(x) for all such x at once!)

*22. (a) Suppose that f is differentiable at x. Prove that

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x - h)}{2h}. \]

Hint: Remember an old algebraic trick: a number is not changed
if the same quantity is added to and then subtracted from it.
**(b) Prove, more generally, that

\[ f'(x) = \lim_{h, k \to 0^+} \frac{f(x + h) - f(x - k)}{h + k}. \]

*23. Prove that if f is even, then f'(x) = −f'(−x). (In order to minimize
confusion, let g(x) = f(−x); find g'(x) and then remember what other
thing g is.) Draw a picture!

24. Prove that if f is odd, then f'(x) = f'(−x). Once again, draw a picture.

25. Problems 23 and 24 say that f' is even if f is odd, and odd if f is even.
What can therefore be said about f^{(k)}?

26. Find f''(x) if
(i) f(x) = x³.
(ii) f(x) = x⁵.
(iii) f'(x) = x⁴.
(iv) f(x + 3) = x⁵.

27. If S_n(x) = x^n, and 0 ≤ k ≤ n, prove that

\[ S_n^{(k)}(x) = \frac{n!}{(n - k)!} x^{n-k} = k! \binom{n}{k} x^{n-k}. \]

28. (a) Find f'(x) if f(x) = |x|³. Find f''(x). Does f'''(x) exist for all x?
(b) Analyze f similarly if f(x) = x⁴ for x ≥ 0 and f(x) = −x⁴ for x ≤ 0.

*29. Let f(x) = x^n for x ≥ 0 and let f(x) = 0 for x ≤ 0. Prove that f^{(n−1)}
exists (and find a formula for it), but that f^{(n)}(0) does not exist.


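30. Interpret the following specimens of Leibnizian notation; each is a
restatement of some fact occurring in a previous problem.

(i) \frac{dx^n}{dx} = nx^{n-1}.

(ii) \frac{dz}{dy} = -\frac{1}{y^2} if z = \frac{1}{y}.

(iii) \frac{d[f(x) + c]}{dx} = \frac{df(x)}{dx}.

(iv) \frac{d[cf(x)]}{dx} = c \frac{df(x)}{dx}.

(v) \frac{dz}{dx} = \frac{dy}{dx} if z = y + c.

(vi) \left. \frac{dx^3}{dx} \right|_{x = a^2} = 3a^4.

(vii) \left. \frac{df(x + a)}{dx} \right|_{x = b} = \left. \frac{df(x)}{dx} \right|_{x = b + a}.

(viii) \left. \frac{df(cx)}{dx} \right|_{x = b} = c \cdot \left. \frac{df(x)}{dx} \right|_{x = cb}.

(ix) \frac{df(cx)}{dx} = c \cdot \left. \frac{df(y)}{dy} \right|_{y = cx}.

(x) \frac{d^k x^n}{dx^k} = k! \binom{n}{k} x^{n-k}.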


CHAPTER

DIFFERENTIATION


The process of finding the derivative of a function is called differentiation. From
the previous chapter you may have the impression that this process is usually
laborious, requires recourse to the definition of the derivative, and depends
upon successfully recognizing some limit. It is true that such a procedure is
often the only possible approach; if you forget the definition of the derivative
you are likely to be lost. Nevertheless, in this chapter we will learn to
differentiate a large number of functions, without the necessity of even recalling
the definition. A few theorems will provide a mechanical process for differentiating
a large class of functions, which are formed from a few simple functions by the
process of addition, multiplication, division, and composition. This description
should suggest what theorems will be proved. We will first find the derivative
of a few simple functions, and then prove theorems about the sums, products,
quotients, and compositions of differentiable functions. The first theorem is
merely a formal recognition of a computation carried out in the previous
chapter.


THEOREM 1   If f is a constant function, f(x) = c, then

f'(a) = 0 for all numbers a.

PROOF

\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{h \to 0} \frac{c - c}{h} = 0. \] ∎


The second theorem is also a special case of a computation in the last
chapter.


THEOREM 2   If f is the identity function, f(x) = x, then

f'(a) = 1 for all numbers a.

PROOF

\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
  &= \lim_{h \to 0} \frac{a + h - a}{h} \\
  &= \lim_{h \to 0} \frac{h}{h} = 1.
\end{aligned}
\] ∎


The derivative of the sum of two functions is just what one would hope— 
the sum of the derivatives. 




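THEOREM 3   If f and g are differentiable at a, then f + g is also differentiable at a, and

(f + g)'(a) = f'(a) + g'(a).

PROOF

\[
\begin{aligned}
(f + g)'(a) &= \lim_{h \to 0} \frac{(f + g)(a + h) - (f + g)(a)}{h} \\
  &= \lim_{h \to 0} \frac{f(a + h) + g(a + h) - [f(a) + g(a)]}{h} \\
  &= \lim_{h \to 0} \left[ \frac{f(a + h) - f(a)}{h} + \frac{g(a + h) - g(a)}{h} \right] \\
  &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} + \lim_{h \to 0} \frac{g(a + h) - g(a)}{h} \\
  &= f'(a) + g'(a).
\end{aligned}
\] ∎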


The formula for the derivative of a product is not as simple as one might 
wish, but it is nevertheless pleasantly symmetric, and the proof requires only a 
simple algebraic trick, which we have found useful before—a number is not 
changed if the same quantity is added to and subtracted from it. 


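THEOREM 4   If f and g are differentiable at a, then f · g is also differentiable at a, and

(f · g)'(a) = f'(a) · g(a) + f(a) · g'(a).

PROOF

\[
\begin{aligned}
(f \cdot g)'(a) &= \lim_{h \to 0} \frac{(f \cdot g)(a + h) - (f \cdot g)(a)}{h} \\
  &= \lim_{h \to 0} \frac{f(a + h)g(a + h) - f(a)g(a)}{h} \\
  &= \lim_{h \to 0} \left[ \frac{f(a + h)[g(a + h) - g(a)]}{h} + \frac{[f(a + h) - f(a)]g(a)}{h} \right] \\
  &= \lim_{h \to 0} f(a + h) \cdot \lim_{h \to 0} \frac{g(a + h) - g(a)}{h} + \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \cdot \lim_{h \to 0} g(a) \\
  &= f(a) \cdot g'(a) + f'(a) \cdot g(a).
\end{aligned}
\]

(Notice that we have used Theorem 9-1 to conclude that \lim_{h \to 0} f(a + h) = f(a).) ∎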


In one special case Theorem 4 simplifies considerably: 


THEOREM 5   If g(x) = cf(x) and f is differentiable at a, then g is differentiable at a, and

g'(a) = c · f'(a).

PROOF   If h(x) = c, so that g = h · f, then by Theorem 4,

\[
\begin{aligned}
g'(a) &= (h \cdot f)'(a) \\
  &= h(a) \cdot f'(a) + h'(a) \cdot f(a) \\
  &= c \cdot f'(a) + 0 \cdot f(a) \\
  &= c \cdot f'(a).
\end{aligned}
\] ∎




Notice, in particular, that (−f)'(a) = −f'(a), and consequently (f − g)'(a)
= (f + [−g])'(a) = f'(a) − g'(a).

To demonstrate what we have already achieved, we will compute the
derivative of some more special functions.

THEOREM 6   If f(x) = x^n for some natural number n, then

f'(a) = na^{n−1} for all a.

PROOF   The proof will be by induction on n. For n = 1 this is simply Theorem 2. Now
assume that the theorem is true for n, so that if f(x) = x^n, then

f'(a) = na^{n−1} for all a.

Let g(x) = x^{n+1}. If I(x) = x, the equation x^{n+1} = x^n · x can be written

g(x) = f(x) · I(x) for all x;

thus g = f · I. It follows from Theorem 4 that

\[
\begin{aligned}
g'(a) = (f \cdot I)'(a) &= f'(a) \cdot I(a) + f(a) \cdot I'(a) \\
  &= na^{n-1} \cdot a + a^n \cdot 1 \\
  &= na^n + a^n \\
  &= (n + 1)a^n, \quad \text{for all } a.
\end{aligned}
\]

This is precisely the case n + 1 which we wished to prove. ∎


Putting together the theorems proved so far we can now find f' for f of the
form

\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0. \]

We obtain

\[ f'(x) = n a_n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + 2 a_2 x + a_1. \]

We can also find f'':

\[ f''(x) = n(n - 1) a_n x^{n-2} + (n - 1)(n - 2) a_{n-1} x^{n-3} + \cdots + 2 a_2. \]

This process can be continued easily. Each differentiation reduces the highest
power of x by 1, and eliminates one more a_i. It is a good idea to work out the
derivatives f''', f^{(4)}, and perhaps f^{(5)}, until the pattern becomes quite clear.
The last interesting derivative is

\[ f^{(n)}(x) = n! \, a_n; \]

for k > n we have

\[ f^{(k)}(x) = 0. \]

Clearly, the next step in our program is to find the derivative of a quotient
f/g. It is quite a bit simpler, and, because of Theorem 4, obviously sufficient
to find the derivative of 1/g.


THEOREM 7   If g is differentiable at a, and g(a) ≠ 0, then 1/g is differentiable at a, and

\[ \left( \frac{1}{g} \right)'(a) = \frac{-g'(a)}{[g(a)]^2}. \]

PROOF   Before we even write

\[ \lim_{h \to 0} \frac{\left( \dfrac{1}{g} \right)(a + h) - \left( \dfrac{1}{g} \right)(a)}{h} \]

we must be sure that this expression makes sense; it is necessary to check that
(1/g)(a + h) is defined for sufficiently small h. This requires only two
observations. Since g is, by hypothesis, differentiable at a, it follows from Theorem 9-1
that g is continuous at a. Since g(a) ≠ 0, it follows from Theorem 6-3 that
there is some δ > 0 such that g(a + h) ≠ 0 for |h| < δ. Therefore (1/g)(a + h)
does make sense for small enough h, and we can write

\[
\begin{aligned}
\lim_{h \to 0} \frac{\left( \dfrac{1}{g} \right)(a + h) - \left( \dfrac{1}{g} \right)(a)}{h}
  &= \lim_{h \to 0} \frac{\dfrac{1}{g(a + h)} - \dfrac{1}{g(a)}}{h} \\
  &= \lim_{h \to 0} \frac{g(a) - g(a + h)}{h[g(a) \cdot g(a + h)]} \\
  &= \lim_{h \to 0} \frac{-[g(a + h) - g(a)]}{h} \cdot \frac{1}{g(a)g(a + h)} \\
  &= \lim_{h \to 0} \frac{-[g(a + h) - g(a)]}{h} \cdot \lim_{h \to 0} \frac{1}{g(a) \cdot g(a + h)} \\
  &= -g'(a) \cdot \frac{1}{[g(a)]^2}.
\end{aligned}
\]

(Notice that we have used continuity of g at a once again.) ∎


The general formula for the derivative of a quotient is now easy to derive.
Though not particularly appealing, it is important, and must simply be
memorized (I always use the incantation: "bottom times derivative of top,
minus top times derivative of bottom, over bottom squared.")


THEOREM 8   If f and g are differentiable at a and g(a) ≠ 0, then f/g is differentiable at a,
and

\[ \left( \frac{f}{g} \right)'(a) = \frac{g(a) \cdot f'(a) - f(a) \cdot g'(a)}{[g(a)]^2}. \]

PROOF   Since f/g = f · (1/g) we have

\[
\begin{aligned}
\left( \frac{f}{g} \right)'(a) &= \left( f \cdot \frac{1}{g} \right)'(a) \\
  &= f'(a) \cdot \left( \frac{1}{g} \right)(a) + f(a) \cdot \left( \frac{1}{g} \right)'(a) \\
  &= \frac{f'(a)}{g(a)} + \frac{f(a)(-g'(a))}{[g(a)]^2} \\
  &= \frac{f'(a) \cdot g(a) - f(a) \cdot g'(a)}{[g(a)]^2}.
\end{aligned}
\] ∎
We can now differentiate a few more functions. For example,

\[ \text{if } f(x) = \frac{x^2 - 1}{x^2 + 1}, \text{ then } f'(x) = \frac{(x^2 + 1)(2x) - (x^2 - 1)(2x)}{(x^2 + 1)^2} = \frac{4x}{(x^2 + 1)^2}; \]

\[ \text{if } f(x) = \frac{x}{x^2 + 1}, \text{ then } f'(x) = \frac{(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \frac{1 - x^2}{(x^2 + 1)^2}; \]

\[ \text{if } f(x) = \frac{1}{x}, \text{ then } f'(x) = -\frac{1}{x^2} = (-1)x^{-2}. \]

Notice that the last example can be generalized: if

\[ f(x) = x^{-n} = \frac{1}{x^n}, \text{ for some natural number } n, \]

then

\[ f'(x) = \frac{-nx^{n-1}}{x^{2n}} = (-n)x^{-n-1}; \]

thus Theorem 6 actually holds both for positive and negative integers. If we
interpret f(x) = x⁰ to mean f(x) = 1, and f'(x) = 0 · x^{−1} to mean f'(x) = 0,
then Theorem 6 is true for n = 0 also. (The word "interpret" is necessary
because it is not clear how 0⁰ should be defined and, in any case, 0 · 0^{−1} is
meaningless.)
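
A formula obtained from the quotient rule can be sanity-checked numerically.
The Python sketch below is an illustration only, not part of the original
text: it compares 4x/(x² + 1)² with the symmetric difference quotient of
Problem 22(a) in the preceding chapter, evaluated at a small fixed h. The
names `central_difference`, `f`, and `f_prime`, and the step h = 1e-6, are
assumptions of this example.

```python
def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    return (x**2 - 1) / (x**2 + 1)


def f_prime(x):
    """Value predicted by the quotient rule: 4x / (x^2 + 1)^2."""
    return 4 * x / (x**2 + 1) ** 2


for x in (-2.0, 0.5, 3.0):
    # The two columns should agree to several decimal places.
    print(x, f_prime(x), central_difference(f, x))
```

Agreement at a few points proves nothing, of course, but a disagreement would
reveal an algebra slip immediately.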

Further progress in differentiation requires the knowledge of the derivatives
of certain special functions to be studied later. One of these is the sine function.
For the moment we shall divulge, and use, the following information, without
proof:

sin'(a) = cos a for all a,
cos'(a) = −sin a for all a.

This information allows us to differentiate many other functions. For example,
if
f(x) = x sin x,
then
f'(x) = x cos x + sin x,
f''(x) = −x sin x + cos x + cos x
       = −x sin x + 2 cos x;
if
g(x) = sin² x = sin x · sin x,
then
g'(x) = sin x cos x + cos x sin x
      = 2 sin x cos x,
g''(x) = 2[(sin x)(−sin x) + cos x cos x]
       = 2[cos² x − sin² x];
if
h(x) = cos² x = cos x · cos x,
then
h'(x) = (cos x)(−sin x) + (−sin x) cos x
      = −2 sin x cos x,
h''(x) = −2[cos² x − sin² x].

Notice that
g'(x) + h'(x) = 0,
hardly surprising, since (g + h)(x) = sin² x + cos² x = 1. As we would
expect, we also have g''(x) + h''(x) = 0.
The examples above involved only products of two functions. A function 
involving triple products can be handled by Theorem 4 also; in fact it can be 
handled in two ways. Remember that f · g · h is an abbreviation for

(f · g) · h or f · (g · h).

Choosing the first of these, for example, we have

\[
\begin{aligned}
(f \cdot g \cdot h)'(x) &= (f \cdot g)'(x) \cdot h(x) + (f \cdot g)(x) h'(x) \\
  &= [f'(x)g(x) + f(x)g'(x)]h(x) + f(x)g(x)h'(x) \\
  &= f'(x)g(x)h(x) + f(x)g'(x)h(x) + f(x)g(x)h'(x).
\end{aligned}
\]

The choice of f · (g · h) would, of course, have given the same result, with a
different intermediate step. The final answer is completely symmetric and
easily remembered:

(f · g · h)' is the sum of the three terms obtained by differentiating each
of f, g, and h and multiplying by the other two.

For example, if
f(x) = x³ sin x cos x,
then
f'(x) = 3x² sin x cos x + x³ cos x cos x + x³ (sin x)(−sin x).

Products of more than 3 functions can be handled similarly. For example, you
should have little difficulty deriving the formula

\[
\begin{aligned}
(f \cdot g \cdot h \cdot k)'(x) = {} & f'(x)g(x)h(x)k(x) + f(x)g'(x)h(x)k(x) \\
  & + f(x)g(x)h'(x)k(x) + f(x)g(x)h(x)k'(x).
\end{aligned}
\]

You might even try to prove (by induction) the general formula:

\[ (f_1 \cdot \ldots \cdot f_n)'(x) = \sum_{i=1}^{n} f_1(x) \cdot \ldots \cdot f_{i-1}(x) f_i'(x) f_{i+1}(x) \cdot \ldots \cdot f_n(x). \]
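
The general product formula reads naturally as a double loop. The following
Python sketch is an illustration only, not part of the original text; the
function name `product_derivative` and the convention of passing the
functions and their derivatives as two parallel lists are assumptions of this
example.

```python
import math


def product_derivative(funcs, derivs, x):
    """Sum over i of f_i'(x) multiplied by f_j(x) for every j different from i."""
    total = 0.0
    for i in range(len(funcs)):
        term = derivs[i](x)          # differentiate the i-th factor only
        for j, f in enumerate(funcs):
            if j != i:
                term *= f(x)         # leave every other factor alone
        total += term
    return total


# The example from the text: f(x) = x^3 sin x cos x.
funcs = [lambda x: x**3, math.sin, math.cos]
derivs = [lambda x: 3 * x**2, math.cos, lambda x: -math.sin(x)]

x = 0.7
by_formula = product_derivative(funcs, derivs, x)
by_hand = (3 * x**2 * math.sin(x) * math.cos(x)
           + x**3 * math.cos(x) * math.cos(x)
           + x**3 * math.sin(x) * (-math.sin(x)))
print(by_formula, by_hand)  # the two values should agree up to rounding
```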




Differentiating the most interesting functions obviously requires a formula
for (f ∘ g)'(x) in terms of f' and g'. To ensure that f ∘ g be differentiable at a,
one reasonable hypothesis would seem to be that g be differentiable at a. Since
the behavior of f ∘ g near a depends on the behavior of f near g(a) (not near a),
it also seems reasonable to assume that f is differentiable at g(a). Indeed we
shall prove that if g is differentiable at a and f is differentiable at g(a), then
f ∘ g is differentiable at a, and

(f ∘ g)'(a) = f'(g(a)) · g'(a).

This extremely important formula is called the Chain Rule, presumably because
a composition of functions might be called a "chain" of functions. Notice that
(f ∘ g)' is practically the product of f' and g', but not quite: f' must be
evaluated at g(a) and g' at a. Before attempting to prove this theorem we will
try a few applications. Suppose

f(x) = sin x².

Let us, temporarily, use S to denote the ("squaring") function S(x) = x². Then

f = sin ∘ S.

Therefore we have

f'(x) = sin'(S(x)) · S'(x)
      = cos x² · 2x.

Quite a different result is obtained if

f(x) = sin² x.

In this case

f = S ∘ sin,

so

f'(x) = S'(sin x) · sin'(x)
      = 2 sin x · cos x.
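
The difference between sin ∘ S and S ∘ sin shows up clearly in numbers. The
Python sketch below is an illustration only, not part of the original text: it
compares each Chain Rule answer with a symmetric difference quotient at a
small fixed h. All function names and the step h = 1e-6 are assumptions of
this example.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def sin_of_square(x):        # f = sin o S
    return math.sin(x**2)


def sin_of_square_prime(x):  # sin'(S(x)) * S'(x)
    return math.cos(x**2) * 2 * x


def square_of_sin(x):        # f = S o sin
    return math.sin(x) ** 2


def square_of_sin_prime(x):  # S'(sin x) * sin'(x)
    return 2 * math.sin(x) * math.cos(x)


x = 1.3
print(sin_of_square_prime(x), central_difference(sin_of_square, x))
print(square_of_sin_prime(x), central_difference(square_of_sin, x))
```

Each line should show two nearly equal numbers, while the two lines differ
from one another.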


Notice that this agrees (as it should) with the result obtained by writing 
f = sin: sin and using the product formula. 

Although we have invented a special symbol, S, to name the “squaring” 
function, it does not take much practice to do problems like this without 
bothering to write down special symbols for functions, and without even 
bothering to write down the particular composition which f is—one soon 
becomes accustomed to taking f apart in one’s head. The following differ- 
entiations may be used as practice for such mental gymnastics—if you find it 
necessary to work a few out on paper, by all means do so, but try to develop 
the knack of writing f’ immediately after seeing the definition of f; problems of 
this sort are so simple that, if you just remember the Chain Rule, there is no 
thought necessary. 


if  f(x) = sin x³            then  f'(x) = cos x³ · 3x²
    f(x) = sin³ x                  f'(x) = 3 sin² x · cos x
    f(x) = sin(1/x)                f'(x) = cos(1/x) · (−1/x²)
    f(x) = sin(sin x)              f'(x) = cos(sin x) · cos x
    f(x) = sin(x³ + 3x²)           f'(x) = cos(x³ + 3x²) · (3x² + 6x)
    f(x) = (x³ + 3x²)^53           f'(x) = 53(x³ + 3x²)^52 · (3x² + 6x).
A function like

f(x) = sin² x² = [sin x²]²,

which is the composition of three functions,

f = S ∘ sin ∘ S,

can also be differentiated by the Chain Rule. It is only necessary to remember
that a triple composition f ∘ g ∘ h means (f ∘ g) ∘ h or f ∘ (g ∘ h). Thus if

f(x) = sin² x²

we can write

f = (S ∘ sin) ∘ S,
f = S ∘ (sin ∘ S).

The derivative of either expression can be found by applying the Chain Rule
twice; the only doubtful point is whether the two expressions lead to equally
simple calculations. As a matter of fact, as any experienced differentiator
knows, it is much better to use the second:

f = S ∘ (sin ∘ S).

We can now write down f'(x) in one fell swoop. To begin with, note that the
first function to be differentiated is S, so the formula for f'(x) begins

f'(x) = 2(        ) ·

Inside the parentheses we must put sin x², the value at x of the second
function, sin ∘ S. Thus we begin by writing

f'(x) = 2 sin x² ·

(the parentheses weren't really necessary, after all). We must now multiply
this much of the answer by the derivative of sin ∘ S at x; this part is easy,
since it involves a composition of two functions, which we already know how to
handle. We obtain, for the final answer,

f'(x) = 2 sin x² · cos x² · 2x.


The following example is handled similarly. Suppose

f(x) = sin(sin x²).

Without even bothering to write down f as a composition g ∘ h ∘ k of three
functions, we can see that the left-most one will be sin, so our expression for
f'(x) begins

f'(x) = cos(        ) ·

Inside the parentheses we must put the value of h ∘ k(x); this is simply sin x²
(what you get from sin(sin x²) by deleting the first sin). So our expression for
f'(x) begins

f'(x) = cos(sin x²) ·

We can now forget about the first sin in sin(sin x²); we have to multiply what
we have so far by the derivative of the function whose value at x is sin x²,
which is again a problem we already know how to solve:

f'(x) = cos(sin x²) · cos x² · 2x.


Finally, here are the derivatives of some other functions which are the
composition of sin and S, as well as some other triple compositions. You can
probably just "see" that the answers are correct; if not, try writing out f as a
composition:

if  f(x) = sin((sin x)²)         then  f'(x) = cos((sin x)²) · 2 sin x · cos x
    f(x) = [sin(sin x)]²               f'(x) = 2 sin(sin x) · cos(sin x) · cos x
    f(x) = sin(sin(sin x))             f'(x) = cos(sin(sin x)) · cos(sin x) · cos x
    f(x) = sin²(x sin x)               f'(x) = 2 sin(x sin x) · cos(x sin x)
                                               · [sin x + x cos x]
    f(x) = sin(sin(x² sin x))          f'(x) = cos(sin(x² sin x))
                                               · cos(x² sin x) · [2x sin x + x² cos x].


The rule for treating compositions of four (or even more) functions is easy: always (mentally) put in parentheses starting from the right,

\[ f \circ (g \circ (h \circ k)), \]

and start reducing the calculation to the derivative of a composition of a smaller number of functions:

\[ f'(g(h(k(x)))) \cdot (\qquad). \]

For example, if

\[
\begin{aligned}
f(x) = \sin^2(\sin^2(x)) \qquad [\,f &= S \circ \sin \circ S \circ \sin \\
&= S \circ (\sin \circ (S \circ \sin))\,]
\end{aligned}
\]

then

\[ f'(x) = 2 \sin(\sin^2 x) \cdot \cos(\sin^2 x) \cdot 2 \sin x \cdot \cos x; \]

if

\[
\begin{aligned}
f(x) = \sin((\sin x^2)^2) \qquad [\,f &= \sin \circ S \circ \sin \circ S \\
&= \sin \circ (S \circ (\sin \circ S))\,]
\end{aligned}
\]

then

\[ f'(x) = \cos((\sin x^2)^2) \cdot 2 \sin x^2 \cdot \cos x^2 \cdot 2x; \]

if

\[ f(x) = \sin^2(\sin(\sin x)) \qquad [\text{fill in yourself, if necessary}] \]

then

\[ f'(x) = 2 \sin(\sin(\sin x)) \cdot \cos(\sin(\sin x)) \cdot \cos(\sin x) \cdot \cos x. \]
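Formulas of this kind are easy to get slightly wrong, so a numerical spot check can be reassuring. The following is only a sketch, not part of the text's method: it compares several of the derivatives worked out above with a symmetric difference quotient. The helper names (`central_difference`, `EXAMPLES`, `max_chain_rule_error`), the step size, and the sample points are all illustrative assumptions.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient (step h is an assumption)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Each entry pairs a function with the derivative formula claimed in the text.
EXAMPLES = {
    "sin^2(x^2)": (
        lambda x: math.sin(x**2) ** 2,
        lambda x: 2 * math.sin(x**2) * math.cos(x**2) * 2 * x,
    ),
    "sin(sin x^2)": (
        lambda x: math.sin(math.sin(x**2)),
        lambda x: math.cos(math.sin(x**2)) * math.cos(x**2) * 2 * x,
    ),
    "sin^2(sin^2 x)": (
        lambda x: math.sin(math.sin(x) ** 2) ** 2,
        lambda x: 2 * math.sin(math.sin(x) ** 2) * math.cos(math.sin(x) ** 2)
        * 2 * math.sin(x) * math.cos(x),
    ),
    "sin((sin x^2)^2)": (
        lambda x: math.sin(math.sin(x**2) ** 2),
        lambda x: math.cos(math.sin(x**2) ** 2) * 2 * math.sin(x**2)
        * math.cos(x**2) * 2 * x,
    ),
}


def max_chain_rule_error(points=(0.3, 1.1, 2.0)):
    """Largest gap between the claimed derivative and the numerical estimate."""
    worst = 0.0
    for f, f_prime in EXAMPLES.values():
        for x in points:
            worst = max(worst, abs(central_difference(f, x) - f_prime(x)))
    return worst


if __name__ == "__main__":
    print(max_chain_rule_error())
```

A small value of `max_chain_rule_error()` is evidence, not proof, that the formulas agree with the functions at the sampled points.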




With these examples as reference, you require only one thing to become a master differentiator: practice. You can be safely turned loose on the exercises at the end of the chapter, and it is now high time that we proved the Chain Rule.

The following argument, while not a proof, indicates some of the tricks one might try, as well as some of the difficulties encountered. We begin, of course, with the definition:

\[
\begin{aligned}
(f \circ g)'(a) &= \lim_{h \to 0} \frac{(f \circ g)(a+h) - (f \circ g)(a)}{h} \\
&= \lim_{h \to 0} \frac{f(g(a+h)) - f(g(a))}{h}.
\end{aligned}
\]


Somewhere in here we would like the expression for $g'(a)$. One approach is to put it in by fiat:

\[
\lim_{h \to 0} \frac{f(g(a+h)) - f(g(a))}{h}
= \lim_{h \to 0} \frac{f(g(a+h)) - f(g(a))}{g(a+h) - g(a)} \cdot \frac{g(a+h) - g(a)}{h}.
\]

This does not look bad, and it looks even better if we write

\[
\lim_{h \to 0} \frac{(f \circ g)(a+h) - (f \circ g)(a)}{h}
= \lim_{h \to 0} \frac{f\bigl(g(a) + [g(a+h) - g(a)]\bigr) - f(g(a))}{g(a+h) - g(a)} \cdot \lim_{h \to 0} \frac{g(a+h) - g(a)}{h}.
\]

The second limit is the factor $g'(a)$ which we want. If we let $g(a+h) - g(a) = k$ (to be precise we should write $k(h)$), then the first limit is

\[ \lim_{h \to 0} \frac{f(g(a) + k) - f(g(a))}{k}. \]


It looks as if this limit should be $f'(g(a))$, since continuity of $g$ at $a$ implies that $k$ goes to 0 as $h$ does. In fact, one can, and we soon will, make this sort of reasoning precise. There is already a problem, however, which you will have noticed if you are the kind of person who does not divide blindly. Even for $h \neq 0$ we might have $g(a+h) - g(a) = 0$, making the division and multiplication by $g(a+h) - g(a)$ meaningless. True, we only care about small $h$, but $g(a+h) - g(a)$ could be 0 for arbitrarily small $h$. The easiest way this can happen is for $g$ to be a constant function, $g(x) = c$. Then $g(a+h) - g(a) = 0$ for all $h$. In this case, $f \circ g$ is also a constant function, $(f \circ g)(x) = f(c)$, so the Chain Rule does indeed hold:

\[ (f \circ g)'(a) = 0 = f'(g(a)) \cdot g'(a). \]

However, there are also nonconstant functions $g$ for which $g(a+h) - g(a) = 0$ for arbitrarily small $h$. For example, if $a = 0$, the function $g$ might be

\[
g(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}
\]

In this case, $g'(0) = 0$, as we showed in Chapter 9. If the Chain Rule is correct, we must have $(f \circ g)'(0) = 0$ for any differentiable $f$, and this is not exactly obvious. A proof of the Chain Rule can be found by considering such recalcitrant functions separately, but it is easier simply to abandon this approach, and use a trick.

THEOREM 9 (THE CHAIN RULE)

If $g$ is differentiable at $a$, and $f$ is differentiable at $g(a)$, then $f \circ g$ is differentiable at $a$, and

\[ (f \circ g)'(a) = f'(g(a)) \cdot g'(a). \]

PROOF

Define a function $\phi$ as follows.

\[
\phi(h) = \begin{cases} \dfrac{f(g(a+h)) - f(g(a))}{g(a+h) - g(a)}, & \text{if } g(a+h) - g(a) \neq 0 \\[2ex] f'(g(a)), & \text{if } g(a+h) - g(a) = 0. \end{cases}
\]
It should be intuitively clear that $\phi$ is continuous at 0: When $h$ is small, $g(a+h) - g(a)$ is also small, so if $g(a+h) - g(a)$ is not zero, then $\phi(h)$ will be close to $f'(g(a))$; and if it is zero, then $\phi(h)$ actually equals $f'(g(a))$, which is even better. Since the continuity of $\phi$ is the crux of the whole proof we will provide a careful translation of this intuitive argument.

We know that $f$ is differentiable at $g(a)$. This means that

\[ \lim_{k \to 0} \frac{f(g(a) + k) - f(g(a))}{k} = f'(g(a)). \]

Thus, if $\varepsilon > 0$ there is some number $\delta' > 0$ such that, for all $k$,

\[ \text{(1)} \qquad \text{if } 0 < |k| < \delta', \text{ then } \left| \frac{f(g(a) + k) - f(g(a))}{k} - f'(g(a)) \right| < \varepsilon. \]

Now $g$ is differentiable at $a$, hence continuous at $a$, so there is a $\delta > 0$ such that, for all $h$,

\[ \text{(2)} \qquad \text{if } |h| < \delta, \text{ then } |g(a+h) - g(a)| < \delta'. \]

Consider now any $h$ with $|h| < \delta$. If $k = g(a+h) - g(a) \neq 0$, then

\[ \phi(h) = \frac{f(g(a+h)) - f(g(a))}{g(a+h) - g(a)} = \frac{f(g(a) + k) - f(g(a))}{k}; \]

it follows from (2) that $|k| < \delta'$, and hence from (1) that

\[ |\phi(h) - f'(g(a))| < \varepsilon. \]

On the other hand, if $g(a+h) - g(a) = 0$, then $\phi(h) = f'(g(a))$, so it is surely true that

\[ |\phi(h) - f'(g(a))| < \varepsilon. \]

We have therefore proved that

\[ \lim_{h \to 0} \phi(h) = f'(g(a)), \]

so $\phi$ is continuous at 0. The rest of the proof is easy. If $h \neq 0$, then we have

\[ \frac{f(g(a+h)) - f(g(a))}{h} = \phi(h) \cdot \frac{g(a+h) - g(a)}{h} \]

even if $g(a+h) - g(a) = 0$ (because in that case both sides are 0). Therefore

\[
\begin{aligned}
(f \circ g)'(a) &= \lim_{h \to 0} \frac{f(g(a+h)) - f(g(a))}{h} = \lim_{h \to 0} \phi(h) \cdot \lim_{h \to 0} \frac{g(a+h) - g(a)}{h} \\
&= f'(g(a)) \cdot g'(a). \qquad ∎
\end{aligned}
\]


Now that we can differentiate so many functions so easily we can take another look at the function

\[
f(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}
\]

In Chapter 9 we showed that $f'(0) = 0$, working straight from the definition (the only possible way). For $x \neq 0$ we can use the methods of this chapter. We have

\[ f'(x) = 2x \sin \frac{1}{x} + x^2 \cos \frac{1}{x} \cdot \left( -\frac{1}{x^2} \right). \]

Thus

\[
f'(x) = \begin{cases} 2x \sin \dfrac{1}{x} - \cos \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}
\]

As this formula reveals, the first derivative $f'$ is indeed badly behaved at 0: it is not even continuous there. If we consider instead

\[
f(x) = \begin{cases} x^3 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0, \end{cases}
\]

then

\[
f'(x) = \begin{cases} 3x^2 \sin \dfrac{1}{x} - x \cos \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}
\]

In this case $f'$ is continuous at 0, but $f''(0)$ does not exist (because the expression $3x^2 \sin 1/x$ defines a function which is differentiable at 0 but the expression $-x \cos 1/x$ does not).

As you may suspect, increasing the power of $x$ yet again produces another improvement. If

\[
f(x) = \begin{cases} x^4 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0, \end{cases}
\]

then

\[
f'(x) = \begin{cases} 4x^3 \sin \dfrac{1}{x} - x^2 \cos \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}
\]

It is easy to compute, right from the definition, that $(f')'(0) = 0$, and $f''(x)$ is easy to find for $x \neq 0$:

\[
f''(x) = \begin{cases} 12x^2 \sin \dfrac{1}{x} - 4x \cos \dfrac{1}{x} - 2x \cos \dfrac{1}{x} - \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}
\]


In this case, the second derivative $f''$ is not continuous at 0. By now you may have guessed the pattern, which two of the problems ask you to establish: if

\[
f(x) = \begin{cases} x^{2n} \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0, \end{cases}
\]

then $f'(0), \ldots, f^{(n)}(0)$ exist, but $f^{(n)}$ is not continuous at 0; if

\[
f(x) = \begin{cases} x^{2n+1} \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0, \end{cases}
\]

then $f'(0), \ldots, f^{(n)}(0)$ exist, and $f^{(n)}$ is continuous at 0, but $f^{(n)}$ is not differentiable at 0. These examples may suggest that "reasonable" functions can be characterized by the possession of higher-order derivatives: no matter how hard we try to mask the infinite oscillation of $f(x) = \sin 1/x$, a derivative of sufficiently high order seems able to reveal the underlying irregularity. Unfortunately, we will see later that much worse things can happen.
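The failure of continuity of $f'$ at 0 for $f(x) = x^2 \sin 1/x$ can be made concrete with a short computation. The sketch below is an illustration only; the function names and the choice of sample points $x_k = 1/(k\pi)$ are assumptions made for convenience. At those points $\sin(1/x_k)$ vanishes, so $f'(x_k) = -\cos(k\pi)$ alternates between $1$ and $-1$ although $x_k$ approaches 0 and $f'(0) = 0$.

```python
import math


def f_prime(x):
    """Derivative of f(x) = x^2 sin(1/x) for x != 0, with f'(0) = 0 as shown in the text."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)


def derivative_samples(n):
    """Values of f' at x_k = 1/(k*pi), k = 1..n (sample points chosen for illustration)."""
    return [f_prime(1 / (k * math.pi)) for k in range(1, n + 1)]


if __name__ == "__main__":
    for k, value in enumerate(derivative_samples(8), start=1):
        print(k, 1 / (k * math.pi), value)
```

Since the sampled values stay near $\pm 1$ however close the points come to 0, they cannot approach $f'(0) = 0$, which is what discontinuity of $f'$ at 0 means.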

After all these involved calculations, we will bring this chapter to a close with a minor remark. It is often tempting, and seems more elegant, to write some of the theorems in this chapter as equations about functions, rather than about their values. Thus Theorem 3 might be written

\[ (f + g)' = f' + g', \]

Theorem 4 might be written as

\[ (f \cdot g)' = f \cdot g' + f' \cdot g, \]

and Theorem 9 often appears in the form

\[ (f \circ g)' = (f' \circ g) \cdot g'. \]

Strictly speaking, these equations may be false, because the functions on the left-hand side might have a larger domain than those on the right. Nevertheless, this is hardly worth worrying about. If $f$ and $g$ are differentiable everywhere in their domains, then these equations, and others like them, are true, and this is the only case any one cares about.


PROBLEMS

1. As a warm up exercise, find $f'(x)$ for each of the following $f$. (Don't worry about the domain of $f$ or $f'$; just get a formula for $f'(x)$ that gives the right answer when it makes sense.)

(i) $f(x) = \sin(x + x^2)$.
(ii) $f(x) = \sin x + \sin x^2$.
(iii) $f(x) = \sin(\cos x)$.
(iv) $f(x) = \sin(\sin x)$.
(v) $f(x) = \sin\left(\dfrac{\cos x}{x}\right)$.
(vi) $f(x) = \dfrac{\sin(\cos x)}{x}$.
(vii) $f(x) = \sin(x + \sin x)$.
(viii) $f(x) = \sin(\cos(\sin x))$.


2. Find $f'(x)$ for each of the following functions $f$. (It took the author 20
minutes to compute the derivatives for the answer section, and it should 
not take you much longer. Although rapid calculation is not the goal of 
mathematics, if you hope to treat theoretical applications of the Chain 
Rule with aplomb, these concrete applications should be child’s play— 
mathematicians like to pretend that they can’t even add, but most of 
them can when they have to.) 


(ἃ) f(x) = sin((x + 1)? + 2)). 
(ii) f(x) = sin?(x? + sin x). 
(iii) f(x) = sin?((x + sin x)?’). 


(iv) f(x) = sin ( Σ ) 


cos x3 
(v) f(x) = sin(x sin x) + sin(sin x?). 
(vi) f(x) = (cos x)%", 
(vii) f(x) = sin? x sin x? sin? x?. 
(viii) f(x) = sin3(sin?(sin x)). 
(ix) f(x) = (x + sin® x)® 
(x) f(x) = sin(sin(sin(sin(sin x)))). 
(xi) f(x) = sin((sin’? x7 + 1)7). 
(xii) f(x) = (((x? + x)3 + x)4 + χ)5. 
(xiii) f(x*) = sin(x? + sin(x? + sin x?)). 
(xiv) f(x) = sin(6 cos(6 sin(6 cos 6x))). 

sin x? sin? x 

ON) f= 1 + sin x 


(xvi) f(x) = —— 




es 

(xvii) f(x) = sin Ἔ = ) 

; ( , 

(xviii) f(*) = sin (, ἜΝ ( x ) 
x — sin x 


3. Find the derivatives of the functions tan, cotan, sec, and cosec. (You
don’t have to memorize these formulas, although they will be needed 
once in a while; if you express your answers in the right way, they will be 
simple and somewhat symmetrical.) 

4. For each of the following functions $f$, find $f'(f(x))$ (not $(f \circ f)'(x)$).


: ιν tw 
(i) ἢ aaranriss 


(ii) f(*) = sin x. 
(ili) f(x) = x. 
(iv) f(x) = 17. 


5. For each of the following functions $f$, find $f(f'(x))$.


Ce OLE 


(ii) f(x) = x? 
(iii) f(x) = 17. 
(iv) f(x) = 17x. 


6. Find $f'$ in terms of $g'$ if

(i) $f(x) = g(x + g(a))$.
(ii) $f(x) = g(x \cdot g(a))$.
(iii) $f(x) = g(x + g(x))$.
(iv) $f(x) = g(x)(x - a)$.
(v) $f(x) = g(a)(x - a)$.
(vi) $f(x + 3) = g(x^2)$.


7. (a) A circular object is increasing in size in some unspecified manner, but it is known that when the radius is 6, the rate of change of the radius is 4. Find the rate of change of the area when the radius is 6. (If $r(t)$ and $A(t)$ represent the radius and the area at time $t$, then the functions $r$ and $A$ satisfy $A = \pi r^2$; a straightforward use of the Chain Rule is called for.)

(b) Suppose that we are now informed that the circular object we have been watching is really the cross section of a spherical object. Find the rate of change of the volume when the radius is 6. (You will clearly need to know a formula for the volume of a sphere; in case you have forgotten, the volume is $\frac{4}{3}\pi$ times the cube of the radius.)

(c) Now suppose that the rate of change of the area of the circular cross section is 5 when the radius is 3. Find the rate of change of the volume when the radius is 3. You should be able to do this problem in two ways: first, by using the formulas for the area and volume in terms of the radius; and then by expressing the volume in terms of the area (to use this method you will need Problem 9-3).

8. Let $f(x) = x^2 \sin 1/x$ for $x \neq 0$, and let $f(0) = 0$. Suppose also that $h$ and $k$ are two functions such that

\[
\begin{aligned}
h'(x) &= \sin^2(\sin(x + 1)) & k'(x) &= f(x + 1) \\
h(0) &= 3 & k(0) &= 0.
\end{aligned}
\]

Find

(i) $(f \circ h)'(0)$.
(ii) $(k \circ f)'(0)$.
(iii) $\alpha'(x^2)$, where $\alpha(x) = h(x^2)$. Exercise great care.

9. Find $f'(0)$ if

\[
f(x) = \begin{cases} g(x) \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0, \end{cases}
\]

and

\[ g(0) = g'(0) = 0. \]

10. Using the derivative of $f(x) = 1/x$, as found in Problem 9-1, find $(1/g)'(x)$ by the Chain Rule.

11. (a) Using Problem 9-3, find $f'(x)$ for $-1 < x < 1$, if $f(x) = \sqrt{1 - x^2}$.

(b) Prove that the tangent line to the graph of $f$ at $(a, \sqrt{1 - a^2})$ intersects the graph only at that point (and thus show that the elementary geometry definition of the tangent line coincides with ours).

12. Prove similarly that the tangent lines to an ellipse or hyperbola intersect these sets only once.

13. If $f + g$ is differentiable at $a$, are $f$ and $g$ necessarily differentiable at $a$? If $f \cdot g$ and $f$ are differentiable at $a$, what conditions on $f$ imply that $g$ is differentiable at $a$?

14. (a) Prove that if $f$ is differentiable at $a$, then $|f|$ is also differentiable at $a$, provided that $f(a) \neq 0$.

(b) Give a counterexample if $f(a) = 0$.

(c) Prove that if $f$ and $g$ are differentiable at $a$, then the functions $\max(f, g)$ and $\min(f, g)$ are differentiable at $a$, provided that $f(a) \neq g(a)$.

(d) Give a counterexample if $f(a) = g(a)$.

15. Suppose that $f^{(n)}(a)$ and $g^{(n)}(a)$ exist. Prove Leibniz's formula:

\[ (f \cdot g)^{(n)}(a) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(a) \cdot g^{(n-k)}(a). \]


*16. Prove that if $f^{(n)}(g(a))$ and $g^{(n)}(a)$ both exist, then $(f \circ g)^{(n)}(a)$ exists. A little experimentation should convince you that it is unwise to seek a formula for $(f \circ g)^{(n)}(a)$. In order to prove that $(f \circ g)^{(n)}(a)$ exists you will therefore have to devise a reasonable assertion about $(f \circ g)^{(n)}(a)$ which can be proved by induction. Try something like: "$(f \circ g)^{(n)}(a)$ exists and is a sum of terms each of which is a product of terms of the
OTM 9 hae a 


17. (a) If $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, find a function $g$ such that $g' = f$. Find another.
(b) If 
b b bin 
f(x) = — + — + He, 
Re κ x 
find a function g with g’ = f. 
(c) Is there a function 
b bin 
ile δ. τ a 


such that f’(x) = 1/x? 
18. Show that there is a polynomial function $f$ of degree $n$ such that

(a) $f'(x) = 0$ for precisely $n - 1$ numbers $x$.

(b) $f'(x) = 0$ for no $x$, if $n$ is odd.

(c) $f'(x) = 0$ for exactly one $x$, if $n$ is even.

(d) $f'(x) = 0$ for exactly $k$ numbers $x$, if $n - k$ is odd.

19. (a) The number $a$ is called a double root of the polynomial function $f$ if $f(x) = (x - a)^2 g(x)$ for some polynomial function $g$. Prove that $a$ is a double root of $f$ if and only if $a$ is a root of both $f$ and $f'$.

(b) When does $f(x) = ax^2 + bx + c$ ($a \neq 0$) have a double root? What does the condition say geometrically?

20. If $f$ is differentiable at $a$, let $d(x) = f(x) - f'(a)(x - a) - f(a)$. Find $d'(a)$. In connection with Problem 19, this gives another solution for Problem 9-20.

21. This problem is a companion to Problem 3-6. Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be given numbers.

(a) If $x_1, \ldots, x_n$ are distinct numbers, prove that there is a polynomial function $f$ of degree $2n - 1$, such that $f(x_j) = f'(x_j) = 0$ for $j \neq i$, and $f(x_i) = a_i$ and $f'(x_i) = b_i$. Hint: Remember Problem 19.

(b) Prove that there is a polynomial function $f$ of degree $2n - 1$ with $f(x_i) = a_i$ and $f'(x_i) = b_i$ for all $i$.

*22. Suppose that $a$ and $b$ are two consecutive roots of a polynomial function $f$, but that $a$ and $b$ are not double roots, so that we can write $f(x) = (x - a)(x - b)g(x)$ where $g(a) \neq 0$ and $g(b) \neq 0$.


(a) Prove that $g(a)$ and $g(b)$ have the same sign. (Remember that $a$ and $b$ are consecutive roots.)

(b) Prove that there is some number $x$ with $a < x < b$ and $f'(x) = 0$. (Also draw a picture to illustrate this fact.) Hint: Compare the sign of $f'(a)$ and $f'(b)$.

(c) Now prove the same fact, even if $a$ and $b$ are multiple roots. Hint: If $f(x) = (x - a)^m (x - b)^n g(x)$ where $g(a) \neq 0$ and $g(b) \neq 0$, consider the polynomial function $h(x) = f'(x)/(x - a)^{m-1}(x - b)^{n-1}$.

This theorem was proved by the French mathematician Rolle, in connection with the problem of approximating roots of polynomials, but the result was not originally stated in terms of derivatives. In fact, Rolle was one of the mathematicians who never accepted the new notions of calculus. This was not such a pigheaded attitude, in view of the fact that for one hundred years no one could define limits in terms that did not verge on the mystic, but on the whole history has been particularly kind to Rolle; his name has become attached to a much more general result, to appear in the next chapter, which forms the basis for the most important theoretical results of calculus.

23. Suppose that $f(x) = xg(x)$ for some function $g$ which is continuous at 0. Prove that $f$ is differentiable at 0, and find $f'(0)$ in terms of $g$.

24. Suppose $f$ is differentiable at 0, and that $f(0) = 0$. Prove that $f(x) = xg(x)$ for some function $g$ which is continuous at 0. Hint: What happens if you try to write $g(x) = f(x)/x$?

25. If $f(x) = x^{-n}$ for $n$ in $\mathbf{N}$, prove that

\[
\begin{aligned}
f^{(k)}(x) &= (-1)^k \frac{(n + k - 1)!}{(n - 1)!} x^{-n-k} \\
&= (-1)^k k! \binom{n + k - 1}{k} x^{-n-k}, \qquad \text{for } x \neq 0.
\end{aligned}
\]

*26. Prove that it is impossible to write $x = f(x)g(x)$ where $f$ and $g$ are differentiable and $f(0) = g(0) = 0$. Hint: Differentiate.

27. What is $f^{(k)}(x)$ if

(a) $f(x) = 1/(x - a)^n$?

*(b) $f(x) = 1/(x^2 - 1)$?

*28. Let $f(x) = x^{2n} \sin 1/x$ if $x \neq 0$, and let $f(0) = 0$. Prove that $f'(0), \ldots, f^{(n)}(0)$ exist, and that $f^{(n)}$ is not continuous at 0. (You will encounter the same basic difficulty as that in Problem 15.)

*29. Let $f(x) = x^{2n+1} \sin 1/x$ if $x \neq 0$, and let $f(0) = 0$. Prove that $f'(0), \ldots, f^{(n)}(0)$ exist, that $f^{(n)}$ is continuous at 0, and that $f^{(n)}$ is not differentiable at 0.


30. In Leibnizian notation the Chain Rule ought to read:

\[ \frac{d f(g(x))}{dx} = \left. \frac{d f(y)}{dy} \right|_{y = g(x)} \cdot \frac{d g(x)}{dx}. \]

Instead, one usually finds the following statement: "Let $y = g(x)$ and $z = f(y)$. Then

\[ \frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}." \]

Notice that the $z$ in $dz/dx$ denotes the composite function $f \circ g$, while the $z$ in $dz/dy$ denotes the function $f$; it is also understood that $dz/dy$ will be "an expression involving $y$," and that in the final answer $g(x)$ must be substituted for $y$. In each of the following cases, find $dz/dx$ by using this formula; then compare with Problem 1.

(i) $z = \sin y$, $y = x + x^2$.

(ii) $z = \sin y$, $y = \cos x$.

(iii) $z = \cos u$, $u = \sin x$.

(iv) $z = \sin v$, $v = \cos u$, $u = \sin x$.

CHAPTER

SIGNIFICANCE OF THE DERIVATIVE

One aim in this chapter is to justify the time we have spent learning to find the derivative of a function. As we shall see, knowing just a little about $f'$ tells us a lot about $f$. Extracting information about $f$ from information about $f'$ requires some difficult work, however, and we shall begin with the one theorem which is really easy.

This theorem is concerned with the maximum value of a function on an interval. Although we have used this term informally in Chapter 7, it is worthwhile to be precise, and also more general.

DEFINITION

Let $f$ be a function and $A$ a set of numbers contained in the domain of $f$. A point $x$ in $A$ is a maximum point for $f$ on $A$, if

\[ f(x) \geq f(y) \quad \text{for every } y \text{ in } A. \]

The number $f(x)$ itself is called the maximum value of $f$ on $A$ (and we also say that $f$ "has its maximum value on $A$ at $x$").

Notice that the maximum value of $f$ on $A$ could be $f(x)$ for several different $x$ (Figure 1); in other words, a function $f$ can have several different maximum points on $A$, although it can have at most one maximum value. Usually we shall be interested in the case where $A$ is a closed interval $[a, b]$; if $f$ is continuous, then Theorem 7-3 guarantees that $f$ does indeed have a maximum value on $[a, b]$.

[FIGURE 1]

The definition of a minimum of $f$ on $A$ will be left to you. (One possible definition is the following: $f$ has a minimum on $A$ at $x$, if $-f$ has a maximum on $A$ at $x$.)

We are now ready for a theorem which does not even depend upon the existence of least upper bounds.

THEOREM 1

Let $f$ be any function defined on $(a, b)$. If $x$ is a maximum (or a minimum) point for $f$ on $(a, b)$, and $f$ is differentiable at $x$, then $f'(x) = 0$. (Notice that we do not assume differentiability, or even continuity, of $f$ at other points.)

PROOF

Consider the case where $f$ has a maximum at $x$. (Figure 2 illustrates the simple idea behind the whole argument: secants drawn through points to the left of $(x, f(x))$ have slopes $\geq 0$, and secants drawn through points to the right of $(x, f(x))$ have slopes $\leq 0$.) Analytically, this argument proceeds as follows.

[FIGURE 2]

If $h$ is any number such that $x + h$ is in $(a, b)$, then

\[ f(x) \geq f(x + h), \]

since $f$ has a maximum on $(a, b)$ at $x$. This means that

\[ f(x + h) - f(x) \leq 0. \]

Thus, if $h > 0$ we have

\[ \frac{f(x + h) - f(x)}{h} \leq 0, \]

and consequently

\[ \lim_{h \to 0^+} \frac{f(x + h) - f(x)}{h} \leq 0. \]

On the other hand, if $h < 0$, we have

\[ \frac{f(x + h) - f(x)}{h} \geq 0, \]

so

\[ \lim_{h \to 0^-} \frac{f(x + h) - f(x)}{h} \geq 0. \]

By hypothesis, $f$ is differentiable at $x$, so these two limits must be equal, in fact equal to $f'(x)$. This means that

\[ f'(x) \leq 0 \quad \text{and} \quad f'(x) \geq 0, \]

from which it follows that $f'(x) = 0$.

The case where $f$ has a minimum at $x$ is left to you (give a one-line proof). ∎

Notice (Figure 3) that we cannot replace $(a, b)$ by $[a, b]$ in the statement of the theorem (unless we add to the hypothesis the condition that $x$ is in $(a, b)$).

[FIGURE 3]

Since $f'(x)$ depends only on the values of $f$ near $x$, it is almost obvious how to get a stronger version of Theorem 1. We begin with a definition which is illustrated in Figure 4.

DEFINITION

Let $f$ be a function, and $A$ a set of numbers contained in the domain of $f$. A point $x$ in $A$ is a local maximum [minimum] point for $f$ on $A$ if there is some $\delta > 0$ such that $x$ is a maximum [minimum] point for $f$ on $A \cap (x - \delta, x + \delta)$.

THEOREM 2

If $f$ is defined on $(a, b)$ and has a local maximum (or minimum) at $x$, and $f$ is differentiable at $x$, then $f'(x) = 0$.

PROOF

You should see why this is an easy application of Theorem 1. ∎


[FIGURE 4 (labels: minimum point, maximum point)]

The converse of Theorem 2 is definitely not true: it is possible for $f'(x)$ to be 0 even if $x$ is not a local maximum or minimum point for $f$. The simplest example is provided by the function $f(x) = x^3$; in this case $f'(0) = 0$, but $f$ has no local maximum or minimum anywhere.

Probably the most widespread misconceptions about calculus are concerned with the behavior of a function $f$ near $x$ when $f'(x) = 0$. The point made in the previous paragraph is so quickly forgotten by those who want the world to be simpler than it is, that we will repeat it: the converse of Theorem 2 is not true. The condition $f'(x) = 0$ does not imply that $x$ is a local maximum or minimum point of $f$. Precisely for this reason, special terminology has been adopted to describe numbers $x$ which satisfy the condition $f'(x) = 0$.

DEFINITION

A critical point of a function $f$ is a number $x$ such that

\[ f'(x) = 0. \]

The number $f(x)$ itself is called a critical value of $f$.


The critical values of f, together with a few other numbers, turn out to be 
the ones which must be considered in order to find the maximum and mini- 
mum of a given function f. To the uninitiated, finding the maximum and 
minimum value of a function represents one of the most intriguing aspects of 
calculus, and there is no denying that problems of this sort are fun (until you 
have done your first hundred or so). 

Let us consider first the problem of finding the maximum or minimum of $f$ on a closed interval $[a, b]$. (Then, if $f$ is continuous, we can at least be sure that a maximum and minimum value exist.) In order to locate the maximum and minimum of $f$ three kinds of points must be considered:

(1) The critical points of $f$ in $[a, b]$.
(2) The end points $a$ and $b$.
(3) Points $x$ in $[a, b]$ such that $f$ is not differentiable at $x$.

If $x$ is a maximum point or a minimum point for $f$ on $[a, b]$, then $x$ must be in one of the three classes listed above: for if $x$ is not in the second or third group, then $x$ is in $(a, b)$ and $f$ is differentiable at $x$; consequently $f'(x) = 0$, by Theorem 1, and this means that $x$ is in the first group.

If there are many points in these three categories, finding the maximum and minimum of $f$ may still be a hopeless proposition, but when there are only a few critical points, and only a few points where $f$ is not differentiable, the procedure is fairly straightforward: one simply finds $f(x)$ for each $x$ satisfying $f'(x) = 0$, and $f(x)$ for each $x$ such that $f$ is not differentiable at $x$ and, finally, $f(a)$ and $f(b)$. The biggest of these will be the maximum value of $f$, and the smallest will be the minimum. A simple example follows.


[FIGURE 5]

Suppose we wish to find the maximum and minimum value of the function

\[ f(x) = x^3 - x \]

on the interval $[-1, 2]$. To begin with, we have

\[ f'(x) = 3x^2 - 1, \]

so $f'(x) = 0$ when $3x^2 - 1 = 0$, that is, when

\[ x = \sqrt{1/3} \quad \text{or} \quad -\sqrt{1/3}. \]

The numbers $\sqrt{1/3}$ and $-\sqrt{1/3}$ both lie in $[-1, 2]$, so the first group of candidates for the location of the maximum and the minimum is

\[ \text{(1)} \qquad \sqrt{1/3}, \quad -\sqrt{1/3}. \]

The second group contains the end points of the interval

\[ \text{(2)} \qquad -1, \quad 2. \]

The third group is empty, since $f$ is differentiable everywhere. The final step is to compute

\[
\begin{aligned}
f(\sqrt{1/3}) &= (\sqrt{1/3})^3 - \sqrt{1/3} = \tfrac{1}{3}\sqrt{1/3} - \sqrt{1/3} = -\tfrac{2}{3}\sqrt{1/3}, \\
f(-\sqrt{1/3}) &= (-\sqrt{1/3})^3 - (-\sqrt{1/3}) = -\tfrac{1}{3}\sqrt{1/3} + \sqrt{1/3} = \tfrac{2}{3}\sqrt{1/3}, \\
f(-1) &= 0, \\
f(2) &= 6.
\end{aligned}
\]

Clearly the minimum value is $-\tfrac{2}{3}\sqrt{1/3}$, occurring at $\sqrt{1/3}$, and the maximum value is 6, occurring at 2.
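The procedure just carried out by hand is mechanical once the three groups of candidates are known, and it can be sketched in a few lines. The code below is an illustration under stated assumptions, not a general-purpose optimizer: the function name `extrema_on_closed_interval` and its parameters are hypothetical, and the caller must supply the critical points and the points of non-differentiability, exactly as the text does.

```python
import math


def f(x):
    """The example function from the text, f(x) = x^3 - x."""
    return x**3 - x


def extrema_on_closed_interval(func, critical_points, endpoints, nondifferentiable=()):
    """Compare func over the three groups of candidates on [a, b].

    critical_points and nondifferentiable are supplied by the caller (an assumption
    of this sketch); candidates outside [a, b] are discarded.
    Returns ((min_point, min_value), (max_point, max_value)).
    """
    a, b = endpoints
    candidates = [x for x in critical_points if a <= x <= b]
    candidates += [x for x in nondifferentiable if a <= x <= b]
    candidates += [a, b]
    min_point = min(candidates, key=func)
    max_point = max(candidates, key=func)
    return (min_point, func(min_point)), (max_point, func(max_point))


ROOT = math.sqrt(1 / 3)  # the critical points of f are ROOT and -ROOT
(MIN_X, MIN_VALUE), (MAX_X, MAX_VALUE) = extrema_on_closed_interval(
    f, [ROOT, -ROOT], (-1, 2)
)

if __name__ == "__main__":
    print("minimum", MIN_VALUE, "at", MIN_X)
    print("maximum", MAX_VALUE, "at", MAX_X)
```

The comparison is only meaningful when a maximum and minimum are known to exist, for instance when the function is continuous on the closed interval.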

This sort of procedure, if feasible, will always locate the maximum and minimum value of a continuous function on a closed interval. If the function we are dealing with is not continuous, however, or if we are seeking the maximum or minimum on an open interval or the whole line, then we cannot even be sure beforehand that the maximum and minimum values exist, so all the information obtained by this procedure may say nothing. Nevertheless, a little ingenuity will often reveal the nature of things. In Chapter 7 we solved just such a problem when we showed that if $n$ is even, then the function

\[ f(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_0 \]

has a minimum value on the whole line. This proves that the minimum value must occur at some number $x$ satisfying

\[ 0 = f'(x) = n x^{n-1} + (n - 1) a_{n-1} x^{n-2} + \cdots + a_1. \]

If we can solve this equation, and compare the values of $f(x)$ for such $x$, we can actually find the minimum of $f$. One more example may be helpful. Suppose we wish to find the maximum and minimum, if they exist, of the function

\[ f(x) = \frac{1}{1 - x^2} \]

on the open interval $(-1, 1)$. We have

\[ f'(x) = \frac{2x}{(1 - x^2)^2}, \]

so $f'(x) = 0$ only for $x = 0$. We can see immediately that for $x$ close to 1 or $-1$ the values of $f(x)$ become arbitrarily large, so $f$ certainly does not have a maximum. This observation also makes it easy to show that $f$ has a minimum at 0. We just note (Figure 5) that there will be numbers $a$ and $b$, with

\[ -1 < a < 0 \quad \text{and} \quad 0 < b < 1, \]

such that $f(x) > f(0)$ for

\[ -1 < x \leq a \quad \text{and} \quad b \leq x < 1. \]

This means that the minimum of $f$ on $[a, b]$ is the minimum of $f$ on all of $(-1, 1)$. Now on $[a, b]$ the minimum occurs either at 0 (the only place where $f' = 0$), or at $a$ or $b$, and $a$ and $b$ have already been ruled out, so the minimum value is $f(0) = 1$.

[FIGURE 6]

In solving these problems we purposely did not draw the graphs of $f(x) = x^3 - x$ and $f(x) = 1/(1 - x^2)$, but it is not cheating to draw the graph (Figure 6) as long as you do not rely solely on your picture to prove anything. As a matter of fact, we are now going to discuss a method of sketching the graph of a function that really gives enough information to be used in discussing maxima and minima; in fact we will be able to locate even local maxima and minima. This method involves consideration of the sign of $f'(x)$, and relies on some deep theorems.

The theorems about derivatives which have been proved so far always yield information about $f'$ in terms of information about $f$. This is true even of Theorem 1, although this theorem can sometimes be used to determine certain information about $f$, namely, the location of maxima and minima. When the derivative was first introduced, we emphasized that $f'(x)$ is not $[f(x + h) - f(x)]/h$ for any particular $h$, but only a limit of these numbers as $h$ approaches 0; this fact becomes painfully relevant when one tries to extract information about $f$ from information about $f'$. The simplest and most frustrating illustration of the difficulties encountered is afforded by the following question: If $f'(x) = 0$ for all $x$, must $f$ be a constant function? It is impossible to imagine how $f$ could be anything else, and this conviction is strengthened by considering the physical interpretation: if the velocity of a particle is always 0, surely the particle must be standing still! Nevertheless it is difficult even to begin a proof that only the constant functions satisfy $f'(x) = 0$ for all $x$. The hypothesis $f'(x) = 0$ only means that

\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = 0, \]

and it is not at all obvious how one can use the information about the limit to derive information about the function.


The fact that $f$ is a constant function if $f'(x) = 0$ for all $x$, and many other facts of the same sort, can all be derived from a fundamental theorem, called the Mean Value Theorem, which states much stronger results. Figure 7 makes it plausible that if $f$ is differentiable on $[a, b]$, then there is some $x$ in $(a, b)$ such that

\[ f'(x) = \frac{f(b) - f(a)}{b - a}. \]

[FIGURE 7]

Geometrically this means that some tangent line is parallel to the line between $(a, f(a))$ and $(b, f(b))$. The Mean Value Theorem asserts that this is true: there is some $x$ in $(a, b)$ such that $f'(x)$, the instantaneous rate of change of $f$ at $x$, is exactly equal to the average or "mean" change of $f$ on $[a, b]$, this average change being $[f(b) - f(a)]/[b - a]$. (For example, if you travel 60 miles in one hour, then at some time you must have been traveling exactly 60 miles per hour.) This theorem is one of the most important theoretical tools of calculus, probably the deepest result about derivatives. From this statement you might conclude that the proof is difficult, but there you would be wrong: the hard theorems in this book have occurred long ago, in Chapter 7. It is true that if you try to prove the Mean Value Theorem yourself you will probably fail, but this is neither evidence that the theorem is hard, nor something to be ashamed of. The first proof of the theorem was an achievement, but today we can supply a proof which is quite simple. It helps to begin with a very special case.

THEOREM 3 (ROLLE'S THEOREM)

If $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$, and $f(a) = f(b)$, then there is a number $x$ in $(a, b)$ such that $f'(x) = 0$.

PROOF

It follows from the continuity of $f$ on $[a, b]$ that $f$ has a maximum and a minimum value on $[a, b]$.

Suppose first that the maximum value occurs at a point $x$ in $(a, b)$. Then $f'(x) = 0$ by Theorem 1, and we are done (Figure 8).

Suppose next that the minimum value of $f$ occurs at some point $x$ in $(a, b)$. Then, again, $f'(x) = 0$ by Theorem 1 (Figure 9).

Finally, suppose the maximum and minimum values both occur at the end points. Since $f(a) = f(b)$, the maximum and minimum values of $f$ are equal, so $f$ is a constant function (Figure 10), and for a constant function we can choose any $x$ in $(a, b)$. ∎

[FIGURE 8] [FIGURE 9] [FIGURE 10]

Notice that we really needed the hypothesis that $f$ is differentiable everywhere on $(a, b)$ in order to apply Theorem 1. Without this assumption the theorem is false (Figure 11).

[FIGURE 11]

You may wonder why a special name should be attached to a theorem as easily proved as Rolle's Theorem. The reason is that, although Rolle's Theorem is a special case of the Mean Value Theorem, it also yields a simple proof of the Mean Value Theorem. In order to prove the Mean Value Theorem we will apply Rolle's Theorem to the function which gives the length of the vertical segment shown in Figure 12; this is the difference between $f(x)$, and the height at $x$ of the line $L$ between $(a, f(a))$ and $(b, f(b))$. Since $L$ is the graph of

\[ g(x) = \left[ \frac{f(b) - f(a)}{b - a} \right] (x - a) + f(a), \]

we want to look at

\[ f(x) - \left[ \frac{f(b) - f(a)}{b - a} \right] (x - a) - f(a). \]

As it turns out, the constant $f(a)$ is irrelevant.

[FIGURE 12]

THEOREM 4 (THE MEAN VALUE THEOREM)

If $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then there is a number $x$ in $(a, b)$ such that

\[ f'(x) = \frac{f(b) - f(a)}{b - a}. \]

PROOF

Let

\[ h(x) = f(x) - \left[ \frac{f(b) - f(a)}{b - a} \right] (x - a). \]

Clearly, $h$ is continuous on $[a, b]$ and differentiable on $(a, b)$, and

\[
\begin{aligned}
h(a) &= f(a), \\
h(b) &= f(b) - \left[ \frac{f(b) - f(a)}{b - a} \right] (b - a) = f(a).
\end{aligned}
\]

Consequently, we may apply Rolle's Theorem to $h$ and conclude that there is some $x$ in $(a, b)$ such that

\[ 0 = h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}, \]

so that

\[ f'(x) = \frac{f(b) - f(a)}{b - a}. \qquad ∎ \]

Notice that the Mean Value Theorem still fits into the pattern exhibited by previous theorems: information about $f$ yields information about $f'$. This information is so strong, however, that we can now go in the other direction.


If f is defined on an interval and f’(x) = 0 for all x in the interval, then f 15 
constant on the interval. 


Let a and ὁ be any two points in the interval with a κέ ὁ. Then there is some x 


in (a, b) such that

$$f'(x) = \frac{f(b) - f(a)}{b - a}.$$

But f’(x) = 0 for all x in the interval, so

$$0 = \frac{f(b) - f(a)}{b - a},$$

and consequently f(a) = f(b). Thus the value of f at any two points in the
interval is the same, i.e., f is constant on the interval. ∎

Naturally, Corollary 1 does not hold for functions defined on two or more
intervals (Figure 13).

FIGURE 13

COROLLARY 2

If f and g are defined on the same interval, and f’(x) = g’(x) for all x in the
interval, then there is some number c such that f = g + c.

PROOF

For all x in the interval we have (f - g)’(x) = f’(x) - g’(x) = 0 so, by
Corollary 1, there is a number c such that f - g = c. ∎

The statement of the next corollary requires some terminology, which is
illustrated in Figure 14.

DEFINITION

A function f is increasing on an interval if f(a) < f(b) whenever a and b
are two numbers in the interval with a < b. The function f is decreasing
on an interval if f(a) > f(b) for all a and b in the interval with a < b. (We
often say simply that f is increasing or decreasing, in which case the interval
is understood to be the domain of f.)

COROLLARY 3

If f’(x) > 0 for all x in an interval, then f is increasing on the interval; if
f’(x) < 0 for all x in the interval, then f is decreasing on the interval.

PROOF

Consider the case where f’(x) > 0. Let a and b be two points in the interval
with a < b. Then there is some x in (a, b) with

$$f'(x) = \frac{f(b) - f(a)}{b - a}.$$

But f’(x) > 0 for all x in (a, b), so

$$\frac{f(b) - f(a)}{b - a} > 0.$$

Since b - a > 0 it follows that f(b) > f(a).
The proof when f’(x) < 0 for all x is left to you. ∎
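The Mean Value Theorem only asserts that a suitable x exists; for a concrete function the point can be located numerically. The Python sketch below is an illustration, not part of the proof. It assumes that f’(x) minus the slope of the secant line changes sign between a and b, so that bisection applies, and the name `mean_value_point` is invented here. It is applied to f(x) = x^3 - x on [0, 2].

```python
def mean_value_point(f, fprime, a, b, tol=1e-12):
    """Locate x in (a, b) with f'(x) = (f(b) - f(a)) / (b - a) by bisection.

    Assumption of this sketch: f'(x) - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        # g vanishes exactly at the points promised by the theorem.
        return fprime(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2, slope


f = lambda x: x**3 - x
fprime = lambda x: 3 * x**2 - 1

x, slope = mean_value_point(f, fprime, 0.0, 2.0)
print(x, slope)
```

Here the secant slope is 3, and the point found is the positive solution of 3x^2 - 1 = 3, namely 2/√3 (about 1.1547).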


FIGURE 14: (a) an increasing function; (b) a decreasing function


Notice that although the converses of Corollary 1 and Corollary 2 are true
(and obvious), the converse of Corollary 3 is not true. If f is increasing, it is
easy to see that f’(x) ≥ 0 for all x, but the equality sign might hold for some x
(consider $f(x) = x^3$).

Corollary 3 provides enough information to get a good idea of the graph of
a function with a minimal amount of point plotting. Consider, once more, the
function $f(x) = x^3 - x$. We have

$$f'(x) = 3x^2 - 1.$$

We have already noted that f’(x) = 0 for $x = \sqrt{1/3}$ and $x = -\sqrt{1/3}$, and
it is also possible to determine the sign of f’(x) for all other x. Note that
$3x^2 - 1 > 0$ precisely when

$$3x^2 > 1,$$
$$x^2 > \tfrac{1}{3},$$
$$x > \sqrt{1/3} \quad\text{or}\quad x < -\sqrt{1/3};$$

thus $3x^2 - 1 < 0$ precisely when

$$-\sqrt{1/3} < x < \sqrt{1/3}.$$

Thus f is increasing for $x < -\sqrt{1/3}$, decreasing between $-\sqrt{1/3}$ and
$\sqrt{1/3}$, and once again increasing for $x > \sqrt{1/3}$. Combining this information
with the following facts

(1) $f(-\sqrt{1/3}) = \tfrac{2}{3}\sqrt{1/3}$,
    $f(\sqrt{1/3}) = -\tfrac{2}{3}\sqrt{1/3}$,
(2) f(x) = 0 for x = -1, 0, 1,
(3) f(x) gets large as x gets large, and large negative as x gets large
    negative,

it is possible to sketch a pretty respectable approximation to the graph
(Figure 15).

By the way, notice that the intervals on which f increases and decreases
could have been found without even bothering to examine the sign of f’. For
example, since f’ is continuous, and vanishes only at $-\sqrt{1/3}$ and $\sqrt{1/3}$, we
know that f’ always has the same sign on the interval $(-\sqrt{1/3}, \sqrt{1/3})$. Since
$f(-\sqrt{1/3}) > f(\sqrt{1/3})$, it follows that f decreases on this interval. Similarly,
f’ always has the same sign on $(\sqrt{1/3}, \infty)$ and f(x) is large for large x, so f
must be increasing on $(\sqrt{1/3}, \infty)$. Another point worth noting: If f’ is continuous,
then the sign of f’ on the interval between two adjacent critical points
can be determined simply by finding the sign of f’(x) for any one x in this
interval.
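The remark about testing a single point in each interval translates directly into a small computation. The sketch below is only an illustration under stated assumptions: f’ is continuous, the list of critical points is complete, and the helper name `monotonicity` is hypothetical. It is applied to f’(x) = 3x^2 - 1.

```python
import math


def monotonicity(fprime, critical_points):
    """Report whether f increases or decreases on each interval cut out by
    the critical points, using one sample point per interval.

    Assumes f' is continuous and that critical_points lists every zero of f'.
    """
    pts = sorted(critical_points)
    samples = [pts[0] - 1.0]                                # left of all critical points
    samples += [(p + q) / 2 for p, q in zip(pts, pts[1:])]  # between neighbours
    samples.append(pts[-1] + 1.0)                           # right of all critical points
    return ["increasing" if fprime(x) > 0 else "decreasing" for x in samples]


def fprime(x):
    # Derivative of f(x) = x**3 - x.
    return 3 * x**2 - 1


s = math.sqrt(1 / 3)
behaviour = monotonicity(fprime, [-s, s])
print(behaviour)
```

The three entries of the result correspond, in order, to the intervals to the left of, between, and to the right of the two critical points.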

FIGURE 15 (graph of $f(x) = x^3 - x$, with the regions marked "f increasing" and "f decreasing")

Our sketch of the graph of $f(x) = x^3 - x$ contains sufficient information to
allow us to say with confidence that $-\sqrt{1/3}$ is a local maximum point, and
$\sqrt{1/3}$ a local minimum point. In fact, we can give a general scheme for
deciding whether a critical point is a local maximum point, a local minimum
point, or neither (Figure 16):


(1) if f’ > 0 in some interval to the left of x and f’ < 0 in some interval 
to the right of x, then x is a local maximum point. 

(2) if f’ < 0 in some interval to the left of x and f’ > 0 in some interval 
to the right of x, then x is a local minimum point. 

(3) if f’ has the same sign in some interval to the left of x as it has in some
interval to the right, then x is neither a local maximum nor a local 
minimum point. 


(There is no point in memorizing these rules—you can always draw the pic- 
tures yourself.) 
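For readers who like to experiment, the three rules can be mimicked numerically. The following sketch is an assumption-laden illustration rather than a substitute for the argument: it samples f’ at distance h on either side of the critical point and so presumes that f’ keeps one sign on each of the two small intervals; the function name `classify` and the step h are choices made here.

```python
import math


def classify(fprime, x, h=1e-3):
    """Classify a critical point x from the sign of f' just left and right of it.

    Assumes f' does not change sign on (x - h, x) or on (x, x + h).
    """
    left, right = fprime(x - h), fprime(x + h)
    if left > 0 and right < 0:
        return "local maximum"      # rule (1)
    if left < 0 and right > 0:
        return "local minimum"      # rule (2)
    if left * right > 0:
        return "neither"            # rule (3)
    return "undetermined"           # f' vanished at a sample point


s = math.sqrt(1 / 3)
cubic_prime = lambda x: 3 * x**2 - 1      # derivative of x**3 - x

print(classify(cubic_prime, -s))
print(classify(cubic_prime, s))
print(classify(lambda x: 3 * x**2, 0.0))  # derivative of x**3 at its critical point 0
```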

FIGURE 16

The polynomial functions can all be analyzed in this way, and it is even
possible to describe the general form of the graph of such functions. To begin,
we need a result already mentioned in Problem 3-7: If

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0,$$

then f has at most n “roots,” i.e., there are at most n numbers x such that
f(x) = 0. Although this is really an algebraic theorem, calculus can be used
to give an easy proof. Notice that if $x_1$ and $x_2$ are roots of f (Figure 17), so that
$f(x_1) = f(x_2) = 0$, then by Rolle’s Theorem there is a number x between
$x_1$ and $x_2$ such that f’(x) = 0. This means that if f has k different roots
$x_1 < x_2 < \cdots < x_k$, then f’ has at least k - 1 different roots: one between
$x_1$ and $x_2$, one between $x_2$ and $x_3$, etc. It is now easy to prove by induction that
a polynomial function

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$$

has at most n roots: The statement is surely true for n = 1, and if we assume
that it is true for n, then the polynomial

$$g(x) = b_{n+1} x^{n+1} + b_n x^n + \cdots + b_0$$

could not have more than n + 1 roots, since if it did, g’ would have more than
n roots.

FIGURE 17

FIGURE 18
With this information it is not hard to describe the graph of

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0.$$

The derivative, being a polynomial function of degree n - 1, has at most
n - 1 roots. Therefore f has at most n - 1 critical points. Of course, a critical
point is not necessarily a local maximum or minimum point, but at any rate,
if a and b are adjacent critical points of f, then f’ will remain either positive or
negative on (a, b), since f’ is continuous; consequently, f will be either increasing
or decreasing on (a, b). Thus f has at most n regions of decrease or
increase.
As a specific example, consider the function

$$f(x) = x^4 - 2x^2.$$

Since

$$f'(x) = 4x^3 - 4x = 4x(x - 1)(x + 1),$$

the critical points of f are -1, 0, and 1, and

$$f(-1) = -1,$$
$$f(0) = 0,$$
$$f(1) = -1.$$

The behavior of f on the intervals between the critical points can be determined
by one of the methods mentioned before. In particular, we could determine
the sign of f’ on these intervals simply by examining the formula for f’(x).
On the other hand, from the three critical values alone we can see (Figure 18)
that f increases on (-1, 0) and decreases on (0, 1). To determine the sign of
f’ on (-∞, -1) and (1, ∞) we can compute

$$f'(-2) = 4 \cdot (-2)^3 - 4 \cdot (-2) = -24,$$
$$f'(2) = 4 \cdot 2^3 - 4 \cdot 2 = 24,$$

and conclude that f is decreasing on (-∞, -1) and increasing on (1, ∞).
These conclusions also follow from the fact that f(x) is large for large x and for
large negative x.

We can already produce a good sketch of the graph; two other pieces of
information provide the finishing touches (Figure 19). First, it is easy to determine
that f(x) = 0 for $x = 0, \pm\sqrt{2}$; second, it is clear that f is even, f(x) =
f(-x), so the graph is symmetric with respect to the vertical axis. The function
$f(x) = x^3 - x$, already sketched in Figure 15, is odd, f(x) = -f(-x), and is
consequently symmetric with respect to the origin. Half the work of graph
sketching may be saved by noticing these things in the beginning.

Several problems in this and succeeding chapters ask you to sketch the 
graphs of functions. In each case you should determine 


(1) the critical points of f,

(2) the value of f at the critical points, 

(3) the sign of f’ in the regions between critical points (if this is not 
already clear), 

(4) the numbers x such that f(x) = 0 (if possible), 

(5) the behavior of f(x) as x becomes large or large negative (if possible). 


Finally, bear in mind that a quick check, to see whether the function is odd or 
even, may save a lot of work. 
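As a rough illustration of this checklist, the steps can be run mechanically for the example $f(x) = x^4 - 2x^2$ treated above. The sketch below is a numerical sanity check only; the sample points, and the finite set of points used for the parity check, are arbitrary choices made here and prove nothing by themselves.

```python
import math


def f(x):
    return x**4 - 2 * x**2


def fprime(x):
    return 4 * x**3 - 4 * x


# (1) critical points: the roots of 4x(x - 1)(x + 1)
critical_points = [-1.0, 0.0, 1.0]

# (2) values of f at the critical points
critical_values = [f(c) for c in critical_points]

# (3) sign of f' at one sample point in each of the four intervals
f_increasing = [fprime(x) > 0 for x in (-2.0, -0.5, 0.5, 2.0)]

# (4) the numbers x with f(x) = 0
roots = [-math.sqrt(2), 0.0, math.sqrt(2)]

# (5) behavior for large and large negative x
large_values = (f(-100.0), f(100.0))

# Parity check at a few sample points (evidence of evenness, not a proof)
looks_even = all(f(x) == f(-x) for x in (0.5, 1.0, 2.0, 3.0))

print(critical_values, f_increasing, large_values, looks_even)
```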

This sort of analysis, if performed with care, will usually reveal the basic 
shape of the graph, but sometimes there are special features which require a 
little more thought. It is impossible to anticipate all of these, but one piece of 
information is often very important. If f is not defined at certain points (for 
example, if f is a rational function whose denominator vanishes at some 
points), then the behavior of f near these points should be determined. 

FIGURE 19 (graph of $f(x) = x^4 - 2x^2$)

For example, consider the function

$$f(x) = \frac{x^2 - 2x + 2}{x - 1},$$

which is not defined at 1. We have

$$f'(x) = \frac{(x - 1)(2x - 2) - (x^2 - 2x + 2)}{(x - 1)^2} = \frac{x(x - 2)}{(x - 1)^2}.$$

Thus

(1) the critical points of f are 0, 2.

Moreover,

(2) f(0) = -2,
    f(2) = 2.


Because f is not defined on the whole interval (0, 2), the sign of f’ must be
determined separately on the intervals (0, 1) and (1, 2), as well as on the
intervals (-∞, 0) and (2, ∞). We can do this by picking particular points in
each of these intervals, or simply by staring hard at the formula for f’. Either
way we find that

(3) f’(x) > 0 if x < 0,
    f’(x) < 0 if 0 < x < 1,
    f’(x) < 0 if 1 < x < 2,
    f’(x) > 0 if 2 < x.


Finally, we must determine the behavior of f(x) as x becomes large or large
negative, as well as when x approaches 1 (this information will also give us
another way to determine the regions on which f increases and decreases).
To examine the behavior as x becomes large we write

$$\frac{x^2 - 2x + 2}{x - 1} = x - 1 + \frac{1}{x - 1};$$

clearly f(x) is close to x - 1 (and slightly larger) when x is large, and f(x) is
close to x - 1 (but slightly smaller) when x is large negative. The behavior
of f near 1 is also easy to determine; since

$$\lim_{x \to 1} (x^2 - 2x + 2) = 1 \neq 0,$$

the fraction

$$\frac{x^2 - 2x + 2}{x - 1}$$

becomes large as x approaches 1 from above and large negative as x approaches
1 from below.

All this information may seem a bit overwhelming, but there is only one
way that it can be pieced together (Figure 20); be sure that you can account
for each feature of the graph.
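The qualitative claims about this rational function are easy to spot-check numerically. The sketch below is illustrative only: the sample inputs (±1000 and points 10^-6 away from 1) are arbitrary choices made here, and floating-point evaluation gives evidence, not proof.

```python
def f(x):
    return (x**2 - 2 * x + 2) / (x - 1)


# Critical values found in the text.
value_at_0 = f(0.0)
value_at_2 = f(2.0)

# For large |x|, f(x) - (x - 1) should equal 1 / (x - 1):
gap_large = f(1000.0) - (1000.0 - 1)        # small and positive
gap_large_neg = f(-1000.0) - (-1000.0 - 1)  # small and negative

# Near x = 1 the values blow up with opposite signs on the two sides.
near_above = f(1 + 1e-6)
near_below = f(1 - 1e-6)

print(value_at_0, value_at_2, gap_large, gap_large_neg, near_above, near_below)
```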

When this sketch has been completed, we might note that it looks like the
graph of an odd function shoved over 1 unit, and the expression

$$\frac{x^2 - 2x + 2}{x - 1} = \frac{(x - 1)^2 + 1}{x - 1}$$

shows that this is indeed the case. However, this is one of those special features
which should be investigated only after you have used the other information
to get a good idea of the appearance of the graph.

FIGURE 20

Although the location of local maxima and minima of a function is always
revealed by a detailed sketch of its graph, it is usually unnecessary to do so
much work. There is a popular test for local maxima and minima which
depends on the behavior of the function only at its critical points.

THEOREM 5

Suppose f’(a) = 0. If f’’(a) > 0, then f has a local minimum at a; if f’’(a) < 0,
then f has a local maximum at a.

PROOF

By definition,

$$f''(a) = \lim_{h \to 0} \frac{f'(a + h) - f'(a)}{h}.$$

Since f’(a) = 0, this can be written

$$f''(a) = \lim_{h \to 0} \frac{f'(a + h)}{h}.$$

Suppose now that f’’(a) > 0. Then f’(a + h)/h must be positive for sufficiently
small h. Therefore:


f’(a + h) must be positive for sufficiently small h > 0
and f’(a + h) must be negative for sufficiently small h < 0.

This means (Corollary 3) that f is increasing in some interval to the right of
a and f is decreasing in some interval to the left of a. Consequently, f has a local
minimum at a.

The proof for the case f’’(a) < 0 is similar. ∎

Theorem 5 may be applied to the function $f(x) = x^3 - x$, which has already
been considered. We have

$$f'(x) = 3x^2 - 1,$$
$$f''(x) = 6x.$$

At the critical points, $-\sqrt{1/3}$ and $\sqrt{1/3}$, we have

$$f''(-\sqrt{1/3}) = -6\sqrt{1/3} < 0,$$
$$f''(\sqrt{1/3}) = 6\sqrt{1/3} > 0.$$

Consequently, $-\sqrt{1/3}$ is a local maximum point and $\sqrt{1/3}$ is a local minimum
point.

Although Theorem 5 will be found quite useful for polynomial functions,
the second derivative of many functions is so complicated that it is easier to
consider the sign of the first derivative. Moreover, if a is a critical point of f it
may happen that f’’(a) = 0. In this case, Theorem 5 provides no information:
it is possible that a is a local maximum point, a local minimum point, or
neither, as shown (Figure 21) by the functions

$$f(x) = -x^4, \qquad f(x) = x^4, \qquad f(x) = x^5;$$

in each case f’(0) = f’’(0) = 0, but 0 is a local maximum point for the first, a
local minimum point for the second, and neither a local maximum nor minimum
point for the third. This point will be pursued further in Part IV.

FIGURE 21

It is interesting to note that Theorem 5 automatically proves a partial converse
of itself.

THEOREM 6

Suppose f’’(a) exists. If f has a local minimum at a, then f’’(a) ≥ 0; if f has a
local maximum at a, then f’’(a) ≤ 0.

PROOF

Suppose f has a local minimum at a. If f’’(a) < 0, then f would also have a
local maximum at a, by Theorem 5. Thus f would be constant in some interval
containing a, so that f’’(a) = 0, a contradiction. Thus we must have f’’(a) ≥ 0.
The case of a local maximum is handled similarly. ∎


(This partial converse to Theorem 5 is the best we can hope for: the ≥
and ≤ signs cannot be replaced by > and <, as shown by the functions
$f(x) = x^4$ and $f(x) = -x^4$.)
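The second derivative test of Theorem 5, together with its blind spot when the second derivative vanishes, can be written out as a tiny decision procedure. This is a sketch for illustration only: the function name `second_derivative_test` is invented here, the second derivatives are entered by hand, and the caller is assumed to have checked already that the first derivative is 0 at the point.

```python
import math


def second_derivative_test(f2_at_a):
    """Apply Theorem 5 at a point a where f'(a) = 0 is already known.

    f2_at_a is the value of the second derivative at a.
    """
    if f2_at_a > 0:
        return "local minimum"
    if f2_at_a < 0:
        return "local maximum"
    return "no information"   # the case where the second derivative is 0


s = math.sqrt(1 / 3)
f2 = lambda x: 6 * x          # second derivative of x**3 - x

at_minus_s = second_derivative_test(f2(-s))
at_plus_s = second_derivative_test(f2(s))

# Second derivatives of -x**4, x**4 and x**5, each evaluated at 0.
degenerate = [
    second_derivative_test(g(0.0))
    for g in (lambda x: -12 * x**2, lambda x: 12 * x**2, lambda x: 20 * x**3)
]

print(at_minus_s, at_plus_s, degenerate)
```

All three degenerate cases return "no information", even though 0 is a local maximum point, a local minimum point, and neither, respectively; in such cases the sign of the first derivative on either side has to be examined instead.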

The remainder of this chapter deals, not with graph sketching, or maxima
and minima, but with three consequences of the Mean Value Theorem. The
first is a simple, but very beautiful, theorem which plays an important role in
Chapter 15, and which also sheds light on many examples which have occurred
in previous chapters.

THEOREM 7

Suppose that f is continuous at a, and that f’(x) exists for all x in some interval
containing a, except perhaps for x = a. Suppose, moreover, that $\lim_{x \to a} f'(x)$
exists. Then f’(a) also exists, and

$$f'(a) = \lim_{x \to a} f'(x).$$

PROOF

By definition,

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}.$$

For sufficiently small h > 0 the function f will be continuous on [a, a + h] and
differentiable on (a, a + h) (a similar assertion holds for sufficiently small
h < 0). By the Mean Value Theorem there is a number $\alpha_h$ in (a, a + h) such
that

$$\frac{f(a + h) - f(a)}{h} = f'(\alpha_h).$$

Now $\alpha_h$ approaches a as h approaches 0, because $\alpha_h$ is in (a, a + h); since
$\lim_{x \to a} f'(x)$ exists, it follows that

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{h \to 0} f'(\alpha_h) = \lim_{x \to a} f'(x).$$

(It is a good idea to supply a rigorous ε-δ argument for this final step, which
we have treated somewhat informally.) ∎

Even if f is an everywhere differentiable function, it is still possible for f’ to
be discontinuous. This happens, for example, if

$$f(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases}$$

According to Theorem 7, however, the graph of f’ can never exhibit a discontinuity
of the type shown in Figure 22. Problem 39 outlines the proof of another
beautiful theorem which gives further information about the function f’.

FIGURE 22

The next theorem, a generalization of the Mean Value Theorem, is of
interest mainly because of its applications.

THEOREM 8 (THE CAUCHY MEAN VALUE THEOREM)

If f and g are continuous on [a, b] and differentiable on (a, b), then there is a
number x in (a, b) such that

$$[f(b) - f(a)]\,g'(x) = [g(b) - g(a)]\,f'(x).$$


(If g(b) ≠ g(a), and g’(x) ≠ 0, this equation can be written

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x)}{g'(x)}.$$

Notice that if g(x) = x for all x, then g’(x) = 1, and we obtain the Mean
Value Theorem. On the other hand, applying the Mean Value Theorem to f
and g separately, we find that there are x and y in (a, b) with

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x)}{g'(y)};$$

but there is no guarantee that the x and y found in this way will be equal.
These remarks may suggest that the Cauchy Mean Value Theorem will be
quite difficult to prove, but actually the simplest of tricks suffices.)

PROOF

Let

$$h(x) = f(x)[g(b) - g(a)] - g(x)[f(b) - f(a)].$$

Then h is continuous on [a, b], differentiable on (a, b), and

$$h(a) = f(a)g(b) - g(a)f(b) = h(b).$$

It follows from Rolle’s Theorem that h’(x) = 0 for some x in (a, b), which
means that

$$0 = f'(x)[g(b) - g(a)] - g'(x)[f(b) - f(a)].$$ ∎

The Cauchy Mean Value Theorem is the basic tool needed to prove a
theorem which facilitates evaluation of limits of the form

$$\lim_{x \to a} \frac{f(x)}{g(x)}$$

when

$$\lim_{x \to a} f(x) = 0 \quad\text{and}\quad \lim_{x \to a} g(x) = 0.$$

In this case, Theorem 5-3 is of no use. Every derivative is a limit of this form,
and computing derivatives frequently requires a great deal of work. If some
derivatives are known, however, many limits of this form can now be evaluated
easily.

THEOREM 9 (L’HÔPITAL’S RULE)

Suppose that

$$\lim_{x \to a} f(x) = 0 \quad\text{and}\quad \lim_{x \to a} g(x) = 0,$$

and suppose also that $\lim_{x \to a} f'(x)/g'(x)$ exists. Then $\lim_{x \to a} f(x)/g(x)$ exists, and

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.$$

(Notice that Theorem 7 is a special case.)


PROOF

The hypothesis that $\lim_{x \to a} f'(x)/g'(x)$ exists contains two implicit assumptions:

(1) there is an interval (a - δ, a + δ) such that f’(x) and g’(x) exist for
all x in (a - δ, a + δ) except, perhaps, for x = a,

(2) in this interval g’(x) ≠ 0 with, once again, the possible exception of
x = a.

On the other hand, f and g are not even assumed to be defined at a. If we
define f(a) = g(a) = 0 (changing the previous values of f(a) and g(a), if
necessary), then f and g are continuous at a. If a < x < a + δ, then the Mean
Value Theorem and the Cauchy Mean Value Theorem apply to f and g on
the interval [a, x] (and a similar statement holds for a - δ < x < a). First
applying the Mean Value Theorem to g, we see that g(x) ≠ 0, for if g(x) = 0
there would be some $x_1$ in (a, x) with $g'(x_1) = 0$, contradicting (2). Now
applying the Cauchy Mean Value Theorem to f and g, we see that there is a
number $\alpha_x$ in (a, x) such that

$$[f(x) - 0]\,g'(\alpha_x) = [g(x) - 0]\,f'(\alpha_x)$$

or

$$\frac{f(x)}{g(x)} = \frac{f'(\alpha_x)}{g'(\alpha_x)}.$$

Now $\alpha_x$ approaches a as x approaches a, because $\alpha_x$ is in (a, x); since
$\lim_{y \to a} f'(y)/g'(y)$ exists, it follows that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(\alpha_x)}{g'(\alpha_x)} = \lim_{y \to a} \frac{f'(y)}{g'(y)}.$$

(Once again, the reader is invited to supply the details of this part of the
argument.) ∎
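A numerical experiment can make the statement of l’Hôpital’s Rule plausible before the details of the proof are filled in. The sketch below uses an example chosen here, not one from the text: $f(x) = x^3 - 1$ and $g(x) = x^2 - 1$ near a = 1, where both functions approach 0 and $f'(x)/g'(x) = 3x^2/2x$ approaches 3/2. The computation only tabulates values and proves nothing.

```python
def f(x):
    return x**3 - 1


def g(x):
    return x**2 - 1


def fprime(x):
    return 3 * x**2


def gprime(x):
    return 2 * x


a = 1.0
ratios = []
for h in (1e-2, 1e-4, 1e-6):
    x = a + h
    # Pair the quotient f/g with the quotient f'/g' at the same point x.
    ratios.append((f(x) / g(x), fprime(x) / gprime(x)))

for quotient, derivative_quotient in ratios:
    print(quotient, derivative_quotient)
```

Both columns settle down near 1.5 as h shrinks, in agreement with Theorem 9.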


PROBLEMS 


1. For each of the following functions, find the maximum and minimum
values on the indicated intervals, by finding the points in the interval
where the derivative is 0, and comparing the values at these points with
the values at the end points.

(i) $f(x) = x^3 - x^2 - 8x + 1$ on [-2, 2].
(ii) $f(x) = x^5 + x + 1$ on [-1, 1].
(iii) $f(x) = 3x^4 - 8x^3 + 6x^2$ on $[-\tfrac{1}{2}, \tfrac{1}{2}]$.
(iv) $f(x) = \dfrac{1}{x^5 + x + 1}$ on $[-\tfrac{1}{2}, 1]$.
(v) $f(x) = \dfrac{x + 1}{x^2 + 1}$ on $[-1, \tfrac{1}{2}]$.
(vi) $f(x) = \dfrac{x}{x^2 - 1}$ on [0, 5].


2. Now sketch the graph of each of the functions in Problem 1, and find all
local maximum and minimum points.

3. (a) If $a_1 < \cdots < a_n$, find the minimum value of $f(x) = \sum_{i=1}^{n} (x - a_i)^2$.

*(b) Now find the minimum value of $f(x) = \sum_{i=1}^{n} |x - a_i|$. This is a problem
where calculus won’t help at all: on the intervals between the
$a_i$’s the function f is linear, so that the minimum clearly occurs at one
of the $a_i$, and these are precisely the points where f is not differentiable.
However, the answer is easy to find if you consider how
f(x) changes as you pass from one such interval to another.

*(c) Let a > 0. Show that the maximum value of

$$f(x) = \frac{1}{1 + |x|} + \frac{1}{1 + |x - a|}$$

is (2 + a)/(1 + a). (The derivative can be found on each of the
intervals (-∞, 0), (0, a), and (a, ∞) separately.)

4. For each of the following functions, find all local maximum and minimum
points.

(i) $$f(x) = \begin{cases} x, & x \neq 3, 5, 7, 9 \\ 5, & x = 3 \\ -3, & x = 5 \\ 9, & x = 7 \\ 7, & x = 9. \end{cases}$$

(ii) $$f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1/q, & x = p/q \text{ in lowest terms.} \end{cases}$$

(iii) $$f(x) = \begin{cases} x, & x \text{ rational} \\ 0, & x \text{ irrational.} \end{cases}$$

(iv) $$f(x) = \begin{cases} 1, & x = 1/n \text{ for some } n \text{ in } \mathbf{N} \\ 0, & \text{otherwise.} \end{cases}$$

(v) $$f(x) = \begin{cases} 1, & \text{if the decimal expansion of } x \text{ contains a 5} \\ 0, & \text{otherwise.} \end{cases}$$

5. A straight line is drawn from the point (0, a) to the horizontal axis, and
then back to (1, b), as in Figure 23. Prove that the total length is shortest
when the angles α and β are equal. (Naturally you must bring a function
into the picture: express the length in terms of x, where (x, 0) is the point
on the horizontal axis. The dashed line in Figure 23 suggests an alternative
geometric proof; in either case the problem can be solved without
actually finding the point (x, 0).)

FIGURE 23 (the points (0, a), (x, 0), and (1, b))

6. Prove that of all rectangles with given perimeter, the square has the
greatest area.


7. Find, among all right circular cylinders of fixed volume V, the one with
smallest surface area (counting the areas of the faces at top and bottom,
as in Figure 24).

FIGURE 24 (surface area is the sum of these areas)

8. Figure 25 shows the graph of the derivative of f. Find all local maximum
and minimum points of f.

FIGURE 25

*9. Suppose that f is a polynomial function, $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$,
with critical points -1, 1, 2, 3, 4, and corresponding critical values
6, 1, 2, 4, 3. Sketch the graph of f, distinguishing the cases n even and n
odd.

*10. (a) Suppose that the polynomial function $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$
has critical points -1, 1, 2, 3, and f’’(-1) = 0, f’’(1) > 0,
f’’(2) < 0, f’’(3) = 0. Sketch the graph of f as accurately as possible
on the basis of this information.

(b) Does there exist a polynomial function with the above properties,
except that 3 is not a critical point?

11. Describe the graph of a rational function (in very general terms, similar
to the text’s description of the graph of a polynomial function).

12. (a) Prove that two polynomial functions of degree m and n, respectively,
intersect in at most max(m, n) points.

(b) For each m and n exhibit two polynomial functions of degree m and n
which intersect max(m, n) times.

13. (a) Suppose that the polynomial function $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$
has exactly k critical points and f’’(x) ≠ 0 for all critical
points x. Show that n - k is odd.

(b) For each n, show that there is a polynomial function f of degree n
with k critical points if n - k is odd.

(c) Suppose that the polynomial function $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$
has $k_1$ local maximum points and $k_2$ local minimum
points. Show that $k_2 = k_1 + 1$ if n is even, and $k_2 = k_1$ if n is odd.

(d) Let n, $k_1$, $k_2$ be three integers with $k_2 = k_1 + 1$ if n is even, and
$k_2 = k_1$ if n is odd, and $k_1 + k_2 < n$. Show that there is a polynomial
function f of degree n, with $k_1$ local maximum points and $k_2$ local
minimum points. Hint: Pick $a_1 < a_2 < \cdots < a_{k_1 + k_2}$ and try

$$f'(x) = \prod_{i=1}^{k_1 + k_2} (x - a_i) \cdot (1 + x^2)^l$$

for an appropriate number l.


14. (a) Prove that if f’(x) ≥ M for all x in [a, b], then f(b) ≥ f(a) +
M(b - a).

(b) Prove that if f’(x) ≤ m for all x in [a, b], then f(b) ≤ f(a) + m(b - a).

(c) Formulate a similar theorem when |f’(x)| ≤ M for all x in [a, b].

15. (a) Suppose that f’(x) > g’(x) for all x, and that f(a) = g(a). Show that
f(x) > g(x) for x > a and f(x) < g(x) for x < a.

(b) Show by an example that these conclusions do not follow without
the hypothesis f(a) = g(a).

16. Find all functions f such that

(a) f’(x) = sin x.
(b) f’’(x) = $x^3$.
(c) f’’’(x) = $x + x^2$.

17. Although it is true that a weight dropped from rest will fall $s(t) = 16t^2$
feet after t seconds, this experimental fact does not mention the behavior
of weights which are thrown upwards or downwards. On the other hand,
the law s’’(t) = 32 is always true and has just enough ambiguity to
account for the behavior of a weight released from any height, with any
initial velocity. For simplicity let us agree to measure heights upwards
from ground level; in this case velocities are positive for rising bodies and
negative for falling bodies, and all bodies fall according to the law
s’’(t) = -32.

(a) Show that s is of the form $s(t) = -16t^2 + \alpha t + \beta$.

(b) By setting t = 0 in the formula for s, and then in the formula for s’,
show that $s(t) = -16t^2 + v_0 t + s_0$, where $s_0$ is the height from which
the body is released at time 0, and $v_0$ is the velocity with which it is
released.

(c) A weight is thrown upwards with velocity v feet per second, at
ground level. How high will it go? (“How high” means “what is the
maximum height for all times.”) What is its velocity at the moment
it achieves its greatest height? What is its acceleration at that
moment? When will it hit the ground again? What will its velocity
be when it hits the ground again?

18. A cannon ball is shot from the ground with velocity v at an angle α
(Figure 26) so that it has a vertical component of velocity v sin α and a
horizontal component v cos α. Its distance s(t) above the ground obeys
the law $s(t) = -16t^2 + (v \sin \alpha)t$, while its horizontal velocity remains
constantly v cos α.

(a) Show that the path of the cannon ball is a parabola (find the position
at each time t, and show that these points lie on a parabola).

(b) Find the angle α which will maximize the horizontal distance
traveled by the cannon ball before striking the ground.

FIGURE 26


*19. (a) Give an example of a function f for which $\lim_{x \to \infty} f(x)$ exists, but
$\lim_{x \to \infty} f'(x)$ does not exist.

(b) Prove that if $\lim_{x \to \infty} f(x)$ and $\lim_{x \to \infty} f'(x)$ both exist, then $\lim_{x \to \infty} f'(x) = 0$.

(c) Prove that if $\lim_{x \to \infty} f(x)$ exists and $\lim_{x \to \infty} f''(x)$ exists, then $\lim_{x \to \infty} f''(x) = 0$
and also $\lim_{x \to \infty} f'(x) = 0$.

(d) Generalize this result.

20. Suppose that f and g are two differentiable functions which satisfy
fg’ - f’g ≠ 0. Prove that if a and b are adjacent zeros of f, and g(a) and
g(b) are not both 0, then g(x) = 0 for some x between a and b. (Naturally
the same result holds with f and g interchanged; thus, the zeros of
f and g separate each other.) Hint: Derive a contradiction from the
assumption that g(x) ≠ 0 for all x between a and b: if a number is not 0,
there is a natural thing to do with it.

21. Suppose that $|f(x) - f(y)| \le |x - y|^n$ for n > 1. Prove that f is constant
by considering f’. Compare with Problem 3-20.

22. Prove that if

$$\frac{a_0}{1} + \frac{a_1}{2} + \cdots + \frac{a_n}{n + 1} = 0,$$

then

$$a_0 + a_1 x + \cdots + a_n x^n = 0$$

for some x in [0, 1].

23. Prove that the polynomial function $f_m(x) = x^3 - 3x + m$ never has two
roots in [0, 1], no matter what m may be. (This is an easy consequence of
Rolle’s Theorem. It is instructive, after giving an analytic proof, to graph
$f_0$ and $f_1$, and consider where the graph of $f_m$ lies in relation to them.)

24. Suppose that f is continuous and differentiable on [0, 1], that f(x) is in
[0, 1] for each x, and that f’(x) ≠ 1 for all x in [0, 1]. Show that there is
exactly one number x in [0, 1] such that f(x) = x. (Half of this problem
has been done already, in Problem 7-11.)

*25. Prove that if f is a twice differentiable function with f(0) = 0 and f(1) = 1
and f’(0) = f’(1) = 0, then |f’’(x)| ≥ 4 for some x in [0, 1]. In more
picturesque terms: A particle which travels a unit distance in a unit time,
and starts and ends with velocity 0, has at some time an acceleration ≥ 4.
Hint: Prove that either f’’(x) > 4 for some x in $[0, \tfrac{1}{2}]$, or else f’’(x) <
-4 for some x in $[\tfrac{1}{2}, 1]$.

26. Suppose that f is a function such that f’(x) = 1/x for all x > 0 and
f(1) = 0. Prove that f(xy) = f(x) + f(y) for all x, y > 0. Hint: Find
g’(x) when g(x) = f(xy).

27. Suppose that f satisfies

$$f''(x) + f'(x)g(x) - f(x) = 0$$


for some function g. Prove that if f is 0 at two points, then f is 0 on the
interval between them. Hint: Use Theorem 6.

28. Suppose that f is n-times differentiable and that f(x) = 0 for n + 1
different x. Prove that $f^{(n)}(x) = 0$ for some x.

29. Prove that

$$\tfrac{1}{9} < \sqrt{66} - 8 < \tfrac{1}{8}$$

(without computing $\sqrt{66}$ to 2 decimal places!).

30. Prove the following slight generalization of the Mean Value Theorem:
If f is continuous and differentiable on (a, b) and $\lim_{y \to a^+} f(y)$ and $\lim_{y \to b^-} f(y)$
exist, then there is some x in (a, b) such that

$$f'(x) = \frac{\lim_{y \to b^-} f(y) - \lim_{y \to a^+} f(y)}{b - a}.$$

(Your proof should begin: “This is a trivial consequence of the Mean
Value Theorem because . . . .”)

31. Prove that the conclusion of the Cauchy Mean Value Theorem can be
written in the form

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(x)}{g'(x)}$$

under the additional assumptions that g(b) ≠ g(a) and that f’(x) and
g’(x) are never simultaneously 0 on (a, b).

32. Prove that if f and g are continuous on [a, b] and differentiable on (a, b),
and g’(x) ≠ 0 for x in (a, b), then there is some x in (a, b) with

$$\frac{f'(x)}{g'(x)} = \frac{f(x) - f(a)}{g(b) - g(x)}.$$

Hint: Multiply out first, to see what this really says.

33. What is wrong with the following use of l’Hôpital’s Rule:

$$\lim_{x \to 1} \frac{x^3 + x - 2}{x^2 - 3x + 2} = \lim_{x \to 1} \frac{3x^2 + 1}{2x - 3} = \lim_{x \to 1} \frac{6x}{2} = 3.$$

(The limit is actually -4.)

34. Find the following limits:

(i) $\lim_{x \to 0} \dfrac{x}{\tan x}$.

(ii) $\lim_{x \to 0} \dfrac{\cos^2 x - 1}{x^2}$.


35. Find f’(0) if

$$f(x) = \begin{cases} \dfrac{g(x)}{x}, & x \neq 0 \\ 0, & x = 0, \end{cases}$$

and g(0) = g’(0) = 0 and g’’(0) = 17.

36. Prove the following forms of l’Hôpital’s Rule (none requiring any essentially
new reasoning).

(a) If $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$, and $\lim_{x \to a^+} f'(x)/g'(x) = l$, then
$\lim_{x \to a^+} f(x)/g(x) = l$ (and similarly for limits from below).

(b) If $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $\lim_{x \to a} f'(x)/g'(x) = \infty$, then
$\lim_{x \to a} f(x)/g(x) = \infty$ (and similarly for -∞, or if $x \to a$ is replaced by
$x \to a^+$ or $x \to a^-$).

(c) If $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$ and $\lim_{x \to \infty} f'(x)/g'(x) = l$, then
$\lim_{x \to \infty} f(x)/g(x) = l$ (and similarly for -∞). Hint: Consider
$\lim_{x \to 0^+} f(1/x)/g(1/x)$.

(d) If $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$ and $\lim_{x \to \infty} f'(x)/g'(x) = \infty$, then
$\lim_{x \to \infty} f(x)/g(x) = \infty$.

37. There is another form of l’Hôpital’s Rule which requires more than
algebraic manipulations: If $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty$, and
$\lim_{x \to \infty} f'(x)/g'(x) = l$, then $\lim_{x \to \infty} f(x)/g(x) = l$. Prove this as follows.

(a) For every ε > 0 there is a number a such that

$$\left| \frac{f'(x)}{g'(x)} - l \right| < \varepsilon \quad \text{for } x > a.$$

Apply the Cauchy Mean Value Theorem to f and g on [a, x] to show
that

$$\left| \frac{f(x) - f(a)}{g(x) - g(a)} - l \right| < \varepsilon \quad \text{for } x > a.$$

(Why can we assume g(x) - g(a) ≠ 0?)

(b) Now write

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} \cdot \frac{f(x)}{f(x) - f(a)} \cdot \frac{g(x) - g(a)}{g(x)}$$

(why can we assume that f(x) - f(a) ≠ 0 for large x?) and conclude
that

$$\left| \frac{f(x)}{g(x)} - l \right| < 2\varepsilon \quad \text{for sufficiently large } x.$$


38. To complete the orgy of variations on l'Hôpital's Rule, use Problem 37 to prove a few more cases of the following general statement (there are so many possibilities that you should select just a few, if any, that interest you):

If $\lim_{x \to [\ ]} f(x) = \lim_{x \to [\ ]} g(x) = \{\ \}$ and $\lim_{x \to [\ ]} f'(x)/g'(x) = (\ )$, then $\lim_{x \to [\ ]} f(x)/g(x) = (\ )$. Here [ ] can be $a$ or $a^+$ or $a^-$ or $\infty$ or $-\infty$, and { } can be 0 or $\infty$ or $-\infty$, and ( ) can be $l$ or $\infty$ or $-\infty$.

39. (a) Suppose that f is differentiable on [a, b]. Prove that if the minimum of f on [a, b] is at a, then f'(a) ≥ 0, and if it is at b, then f'(b) ≤ 0. (One half of the proof of Theorem 1 will go through.)

(b) Suppose that f'(a) < 0 and f'(b) > 0. Show that f'(x) = 0 for some x in (a, b). Hint: Consider the minimum of f on [a, b]; why must it be somewhere in (a, b)?

(c) Prove that if f'(a) < c < f'(b), then f'(x) = c for some x in (a, b). (This result is known as Darboux's Theorem.) Hint: Cook up an appropriate function to which part (b) may be applied.

40. It is easy to find a function f such that |f| is differentiable but f is not. For example, we can choose f(x) = 1 for x rational and f(x) = −1 for x irrational. In this example f is not even continuous, nor is this a mere coincidence: Prove that if |f| is differentiable at a, and f is continuous at a, then f is also differentiable at a. Hint: It suffices to consider only a with f(a) = 0. Why? In this case, what must |f|'(a) be?

*41. (a) Let y ≠ 0 and let n be even. Prove that $x^n + y^n = (x + y)^n$ only when x = 0. Hint: If $x_0^n + y^n = (x_0 + y)^n$, apply Rolle's Theorem to $f(x) = x^n + y^n - (x + y)^n$ on $[0, x_0]$.

(b) Prove that if y ≠ 0 and n is odd, then $x^n + y^n = (x + y)^n$ only if x = 0 or x = −y.

42. Use the method of Problem 41 to prove that if n is even and $f(x) = x^n$, then every tangent line to f intersects f only once.

**43. Prove even more generally that if f' is increasing, then every tangent line intersects f only once.

*44. Suppose that f(0) = 0 and f' is increasing. Prove that the function g(x) = f(x)/x is increasing on (0, ∞). Hint: Obviously you should look at g'(x). Prove that it is positive by applying the Mean Value Theorem to f on the right interval (it will help to remember that the hypothesis f(0) = 0 is essential, as shown by the function $f(x) = 1 + x^2$).

45. Use derivatives to prove that if n > 1, then

\[ (1 + x)^n > 1 + nx \quad \text{for } -1 < x < 0 \text{ and } 0 < x \]

(notice that equality holds for x = 0).


FIGURE 27

46. Let $f(x) = x^4 \sin^2 1/x$ for x ≠ 0, and let f(0) = 0 (Figure 27).

(a) Prove that 0 is a local minimum point for f.
(b) Prove that f'(0) = f''(0) = 0.

This function thus provides another example to show that Theorem 6 cannot be improved. It also illustrates a subtlety about maxima and minima that often goes unnoticed: a function may not be increasing in any interval to the right of a local minimum point, nor decreasing in any interval to the left.

*47. (a) Prove that if f'(a) > 0 and f' is continuous at a, then f is increasing in some interval containing a.

The next two parts of this problem show that continuity of f' is essential.

(b) If $g(x) = x^2 \sin 1/x$, show that there are numbers x arbitrarily close to 0 with g'(x) = 1 and also with g'(x) = −1.

(c) Suppose 0 < α < 1. Let $f(x) = \alpha x + x^2 \sin 1/x$ for x ≠ 0, and let f(0) = 0 (see Figure 28). Show that f is not increasing in any open interval containing 0, by showing that in any interval there are points x with f'(x) > 0 and also points x with f'(x) < 0.

The behavior of f for α ≥ 1, which is much more difficult to analyze, is discussed in the next problem.

FIGURE 28: $f(x) = \alpha x + x^2 \sin 1/x$ for x ≠ 0, f(0) = 0

**48. Let $f(x) = \alpha x + x^2 \sin 1/x$ for x ≠ 0, and let f(0) = 0. In order to find the sign of f'(x) when α ≥ 1 it is necessary to decide if $2x \sin 1/x - \cos 1/x$ is < −1 for any numbers x close to 0. It is a little more convenient to consider the function $g(y) = 2(\sin y)/y - \cos y$ for y ≠ 0; we want to know if g(y) < −1 for large y. This question is quite delicate; the most


significant part of g(y) is −cos y, which does reach the value −1, but this happens only when sin y = 0, and it is not at all clear whether g itself can have values < −1. The obvious approach to this problem is to find the local minimum values of g. Unfortunately, it is impossible to solve the equation g'(y) = 0 explicitly, so more ingenuity is required.

(a) Show that if g'(y) = 0, then

\[ \cos y = (\sin y)\left(\frac{2 - y^2}{2y}\right), \]

and conclude that

\[ g(y) = (\sin y)\left(\frac{2 + y^2}{2y}\right). \]

(b) Now show that if g'(y) = 0, then

\[ \sin^2 y = \frac{4y^2}{4 + y^4}, \]

and conclude that

\[ |g(y)| = \frac{2 + y^2}{\sqrt{4 + y^4}}. \]

(c) Using the fact that $(2 + y^2)/\sqrt{4 + y^4} > 1$, show that if α = 1, then f is not increasing in any interval around 0.

(d) Using the fact that $\lim_{y \to \infty} (2 + y^2)/\sqrt{4 + y^4} = 1$, show that if α > 1, then f is increasing in some interval around 0.

**49. A function f is increasing at a if there is some number δ > 0 such that

\[ f(x) > f(a) \quad \text{if } a < x < a + \delta \]

and

\[ f(x) < f(a) \quad \text{if } a - \delta < x < a. \]

Notice that this does not mean that f is increasing in the interval (a − δ, a + δ); for example, the function shown in Figure 28 is increasing at 0, but is not an increasing function in any open interval containing 0.

(a) Suppose that f is continuous on [0, 1] and that f is increasing at a for every a in [0, 1]. Prove that f is increasing on [0, 1]. (First convince yourself that there is something to be proved.) Hint: For 0 < b < 1, prove that the minimum of f on [b, 1] must be at b.

(b) Prove part (a) without the assumption that f is continuous, by considering for each b in [0, 1] the set $S_b = \{x : f(y) \geq f(b) \text{ for all } y \text{ in } [b, x]\}$. (This part of the problem is not necessary for the other parts.) Hint: Prove that $S_b = \{x : b \leq x \leq 1\}$ by considering $\sup S_b$.

(c) If f is increasing at a and f is differentiable at a, prove that f'(a) ≥ 0 (this is easy).


(d) If f'(a) > 0, prove that f is increasing at a (go right back to the definition of f'(a)).

(e) Use parts (a) and (d) to show, without using the Mean Value Theorem, that if f is continuous on [0, 1] and f'(a) > 0 for all a in [0, 1], then f is increasing on [0, 1].

(f) Suppose that f is continuous on [0, 1] and f'(a) = 0 for all a in (0, 1). Apply part (e) to the function g(x) = f(x) + εx to show that f(1) − f(0) > −ε. Similarly, show that f(1) − f(0) < ε by considering h(x) = εx − f(x). Conclude that f(0) = f(1).

This particular proof that a function with zero derivative must be constant has many points in common with a proof of H. A. Schwarz, which may be the first rigorous proof ever given. Its discoverer, at least, seemed to think it was. See his exuberant letter in reference [41] of the Suggested Reading.

**50. If f is a constant function, then every point is a local maximum point for f. It is quite possible for this to happen even if f is not a constant function: for example, if f(x) = 0 for x < 0 and f(x) = 1 for x ≥ 0. Prove, however, that if f is continuous on [a, b] and every point of [a, b] is a local maximum point, then f is a constant function. Hint: If the minimum of f occurs at $x_0$, consider the set $\{x \text{ in } [x_0, b] : f(y) = f(x_0) \text{ for all } y \text{ in } [x_0, x]\}$.

51. (a) A point x is called a strict maximum point for f on A if f(x) > f(y) for all y in A with y ≠ x (compare with the definition of an ordinary maximum point). A local strict maximum point is defined in the obvious way. Find all local strict maximum points of the function

\[ f(x) = \begin{cases} 0, & x \text{ irrational} \\ \dfrac{1}{q}, & x = \dfrac{p}{q} \text{ in lowest terms.} \end{cases} \]

It seems quite unlikely that a function can have a local strict maximum at every point (although the above example might give one pause for thought). Prove this as follows.

(b) Suppose that every point is a local strict maximum point for f. Let $x_1$ be any number and choose $a_1 < x_1 < b_1$ with $b_1 - a_1 < 1$ such that $f(x_1) > f(x)$ for all x in $[a_1, b_1]$. Let $x_2 \neq x_1$ be any point in $(a_1, b_1)$ and choose $a_1 \leq a_2 < x_2 < b_2 \leq b_1$ with $b_2 - a_2 < \frac{1}{2}$ such that $f(x_2) > f(x)$ for all x in $[a_2, b_2]$. Continue in this way, and use the Nested Interval Theorem (Problem 8-14) to obtain a contradiction.


FIGURE 1 




APPENDIX. CONVEXITY AND CONCAVITY 


Although the graph of a function can be sketched quite accurately on the 
basis of the information provided by the derivative, some subtle aspects of the 
graph are revealed only by examining the second derivative. These details 
were purposely omitted previously because graph sketching is complicated 
enough without worrying about them, and the additional information ob- 
tained is often not worth the effort. Also, correct proofs of the relevant facts 
are sufficiently difficult to be placed in an appendix. Despite these discour- 
aging remarks, the information presented here is well worth assimilating, 
because the notions of convexity and concavity are far more important than 
as mere aids to graph sketching. Moreover, the proofs have a pleasantly 
geometric flavor not often found in calculus theorems. Indeed, the basic 
definition is geometric in nature (see Figure 1). 


DEFINITION 1 A function f is convex on an interval, if for all a and b in the interval, the line segment joining (a, f(a)) and (b, f(b)) lies above the graph of f.


The geometric condition appearing in this definition can be expressed in an analytic way that is sometimes more useful in proofs. The straight line between (a, f(a)) and (b, f(b)) is the graph of the function g defined by

\[ g(x) = \frac{f(b) - f(a)}{b - a}(x - a) + f(a). \]

This line lies above the graph of f at x if g(x) > f(x), that is, if

\[ \frac{f(b) - f(a)}{b - a}(x - a) + f(a) > f(x) \]

or

\[ \frac{f(b) - f(a)}{b - a}(x - a) > f(x) - f(a) \]

or

\[ \frac{f(b) - f(a)}{b - a} > \frac{f(x) - f(a)}{x - a}. \]

We therefore have an equivalent definition of convexity.


DEFINITION 2 A function f is convex on an interval if for a, x, and b in the interval with a < x < b we have

\[ \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}. \]

If the word "above" in Definition 1 is replaced by "below" or, equivalently, if the inequality in Definition 2 is replaced by

\[ \frac{f(x) - f(a)}{x - a} > \frac{f(b) - f(a)}{b - a}, \]

we obtain the definition of a concave function (Figure 2). It is not hard to see that the concave functions are precisely the ones of the form −f, where f is convex. For this reason, the next three theorems about convex functions have immediate corollaries about concave functions, so simple that we will not even bother to state them.

FIGURE 2

Figure 3 shows some tangent lines of a convex function. Two things seem to be true:

(1) The graph of f lies above the tangent line at (a, f(a)) except at the point (a, f(a)) itself (this point is called the point of contact of the tangent line).

(2) If a < b, then the slope of the tangent line at (a, f(a)) is less than the slope of the tangent line at (b, f(b)); that is, f' is increasing.

As a matter of fact these observations are true, and the proofs are not difficult.

FIGURE 3

THEOREM 1 Let f be convex. If f is differentiable at a, then the graph of f lies above the tangent line through (a, f(a)), except at (a, f(a)) itself. If a < b and f is differentiable at a and b, then f'(a) < f'(b).

PROOF If $0 < h_1 < h_2$, then, as Figure 4 indicates,

\[ \frac{f(a + h_1) - f(a)}{h_1} < \frac{f(a + h_2) - f(a)}{h_2}. \tag{1} \]

A nonpictorial proof can be derived immediately from Definition 2 applied to $a < a + h_1 < a + h_2$. Inequality (1) shows that the values of

\[ \frac{f(a + h) - f(a)}{h} \]

decrease as $h \to 0^+$. Consequently,

\[ f'(a) < \frac{f(a + h) - f(a)}{h} \quad \text{for } h > 0 \]

(in fact f'(a) is the greatest lower bound of all these numbers). But this means that for h > 0 the secant line through (a, f(a)) and (a + h, f(a + h)) has larger slope than the tangent line, which implies that (a + h, f(a + h)) lies above the tangent line (an analytic translation of this argument is easily supplied).

FIGURE 4

For negative h there is a similar situation (Figure 5): if $h_2 < h_1 < 0$, then

\[ \frac{f(a + h_1) - f(a)}{h_1} > \frac{f(a + h_2) - f(a)}{h_2}. \]

This shows that the slope of the tangent line is greater than

\[ \frac{f(a + h) - f(a)}{h} \quad \text{for } h < 0 \]

(in fact f'(a) is the least upper bound of all these numbers), so that f(a + h) lies above the tangent line if h < 0. This proves the first part of the theorem.

FIGURE 5

Now suppose that a < b. Then, as we have already seen (Figure 6),

\[ f'(a) < \frac{f(a + (b - a)) - f(a)}{b - a} = \frac{f(b) - f(a)}{b - a}, \quad \text{since } b - a > 0, \]

and

\[ f'(b) > \frac{f(b + (a - b)) - f(b)}{a - b} = \frac{f(a) - f(b)}{a - b} = \frac{f(b) - f(a)}{b - a}, \quad \text{since } a - b < 0. \]

Combining these inequalities, we obtain f'(a) < f'(b).

FIGURE 6

Theorem 1 has two converses. Here the proofs will be a little more difficult.

FIGURE 7


We begin with a lemma that plays the same role in the next theorem that 
Rolle’s Theorem plays in the proof of the Mean Value Theorem. It states that 
if f’ is increasing, then the graph of f lies below any secant line which happens to 
be horizontal. 


LEMMA Suppose f is differentiable and f' is increasing. If a < b and f(a) = f(b), then f(x) < f(a) = f(b) for a < x < b.

PROOF Suppose first that f(x) > f(a) = f(b) for some x in (a, b). Then the maximum of f on [a, b] occurs at some point $x_0$ in (a, b) with $f(x_0) > f(a)$ and, of course, $f'(x_0) = 0$ (Figure 7). On the other hand, applying the Mean Value Theorem to the interval $[a, x_0]$, we find that there is $x_1$ with $a < x_1 < x_0$ and

\[ f'(x_1) = \frac{f(x_0) - f(a)}{x_0 - a} > 0, \]


contradicting the fact that f' is increasing. This proves that f(x) ≤ f(a) = f(b) for a < x < b, and it only remains to prove that f(x) = f(a) is also impossible for x in (a, b).

Suppose f(x) = f(a) for some x in (a, b). We know that f is not constant on [a, x] (if it were, f' would not be increasing on [a, x]) so there is (Figure 8) some $x_1$ with $a < x_1 < x$ and $f(x_1) < f(a)$. Applying the Mean Value Theorem to $[x_1, x]$ we conclude that there is $x_2$ with $x_1 < x_2 < x$ and

\[ f'(x_2) = \frac{f(x) - f(x_1)}{x - x_1} > 0. \]

On the other hand, f'(x) = 0, since a local maximum occurs at x. Again this contradicts the hypothesis that f' is increasing.

FIGURE 8

We now attack the general case by the same sort of algebraic machinations that we used in the proof of the Mean Value Theorem.

THEOREM 2 If f is differentiable and f' is increasing, then f is convex.

PROOF Let a < b. Define g by

\[ g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a). \]

It is easy to see that g' is also increasing; moreover, g(a) = g(b) = f(a). Applying the lemma to g we conclude that

\[ g(x) < f(a) \quad \text{if } a < x < b. \]

In other words, if a < x < b, then

\[ f(x) - \frac{f(b) - f(a)}{b - a}(x - a) < f(a) \]

or

\[ \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}. \]

Hence f is convex.

THEOREM 3 If f is differentiable and the graph of f lies above each tangent line except at the point of contact, then f is convex.

PROOF Let a < b. It is clear from Figure 9 that if (b, f(b)) lies above the tangent line at (a, f(a)), and (a, f(a)) lies above the tangent line at (b, f(b)), then the slope of the tangent line at (b, f(b)) must be larger than the slope of the tangent line at (a, f(a)). The following argument just says this with equations.

FIGURE 9

Since the tangent line at (a, f(a)) is the graph of the function

\[ g(x) = f'(a)(x - a) + f(a), \]


and since (b, f(b)) lies above the tangent line, we have

\[ f(b) > f'(a)(b - a) + f(a). \tag{1} \]

Similarly, since the tangent line at (b, f(b)) is the graph of

\[ h(x) = f'(b)(x - b) + f(b), \]

and (a, f(a)) lies above the tangent line at (b, f(b)), we have

\[ f(a) > f'(b)(a - b) + f(b). \tag{2} \]

It follows from (1) and (2) that f'(a) < f'(b).

It now follows from Theorem 2 that f is convex.


If a function f has a reasonable second derivative, the information given in these theorems can be used to discover the regions in which f is convex or concave. Consider, for example, the function

\[ f(x) = \frac{1}{1 + x^2}. \]

For this function,

\[ f'(x) = \frac{-2x}{(1 + x^2)^2}. \]

Thus f'(x) = 0 only for x = 0, and f(0) = 1, while

f'(x) > 0 if x < 0,
f'(x) < 0 if x > 0.

Moreover,

f(x) > 0 for all x,
f(x) → 0 as x → ∞ or −∞,
f is even.

FIGURE 10

The graph of f therefore looks something like Figure 10. We now compute

\[ f''(x) = \frac{(1 + x^2)^2(-2) + 2x \cdot [2(1 + x^2) \cdot 2x]}{(1 + x^2)^4} = \frac{2(3x^2 - 1)}{(1 + x^2)^3}. \]


It is not hard to determine the sign of f''(x). Note first that f''(x) = 0 only when $x = \sqrt{1/3}$ or $-\sqrt{1/3}$. Since f'' is clearly continuous, it must keep the same sign on each of the sets

\[ (-\infty, -\sqrt{1/3}), \qquad (-\sqrt{1/3}, \sqrt{1/3}), \qquad (\sqrt{1/3}, \infty). \]

Since we easily compute, for example, that

\[ f''(-1) = \tfrac{1}{2} > 0, \qquad f''(0) = -2 < 0, \qquad f''(1) = \tfrac{1}{2} > 0, \]

we conclude that

f'' > 0 on $(-\infty, -\sqrt{1/3})$ and $(\sqrt{1/3}, \infty)$,
f'' < 0 on $(-\sqrt{1/3}, \sqrt{1/3})$.

Since f'' > 0 means f' is increasing, it follows from Theorem 2 that f is convex on $(-\infty, -\sqrt{1/3})$ and $(\sqrt{1/3}, \infty)$, while on $(-\sqrt{1/3}, \sqrt{1/3})$ f is concave (Figure 11).


FIGURE 11: f is convex to the left of $-\sqrt{1/3}$, concave between $-\sqrt{1/3}$ and $\sqrt{1/3}$, and convex to the right of $\sqrt{1/3}$
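The sign analysis of f'' for f(x) = 1/(1 + x²) can be cross-checked numerically. This is a minimal sketch: the closed form for f'' is the one computed in the text, while the finite-difference helper and its step size are illustrative assumptions.

```python
import math


def f(x):
    return 1 / (1 + x**2)


def f_second(x):
    """Closed form f''(x) = 2(3x^2 - 1) / (1 + x^2)^3."""
    return 2 * (3 * x**2 - 1) / (1 + x**2) ** 3


def second_difference(x, h=1e-4):
    """Central finite-difference estimate of f''(x); h is an arbitrary small step."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2


r = math.sqrt(1 / 3)                                   # candidate inflection point
test_values = {x: f_second(x) for x in (-1.0, 0.0, 1.0)}

print(test_values)                                     # one sample point per interval
print(f_second(r), f(r))                               # f'' vanishes at r, where f = 3/4
print(second_difference(0.5), f_second(0.5))           # the two estimates should agree closely
```

One test point per interval is enough here because f'' is continuous and vanishes only at ±√(1/3), which is exactly the argument used in the text.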


Notice that at $(\sqrt{1/3}, \tfrac{3}{4})$ the tangent line lies below the part of the graph to the right, since f is convex on $(\sqrt{1/3}, \infty)$, and above the part of the graph to the left, since f is concave on $(-\sqrt{1/3}, \sqrt{1/3})$; thus the tangent line crosses the graph. In general, a number a is called an inflection point of f if the tangent line to the graph of f at (a, f(a)) crosses the graph; thus $\sqrt{1/3}$ and $-\sqrt{1/3}$ are inflection points of $f(x) = 1/(1 + x^2)$. Note that the condition f''(a) = 0 does not ensure that a is an inflection point of f; for example, if $f(x) = x^4$, then f''(0) = 0, but f is convex, so the tangent line at (0, 0) certainly doesn't cross the graph of f. In order for a to be an inflection point of a function f, it is necessary that f'' should have different signs to the left and right of a.

This example illustrates the procedure which may be used to analyze any function f. After the graph has been sketched, using the information provided by f', the zeros of f'' are computed and the sign of f'' is determined on the intervals between consecutive zeros. On intervals where f'' > 0 the function is convex; on intervals where f'' < 0 the function is concave. Knowledge of the regions of convexity and concavity of f can often prevent absurd misinterpretation of other data about f. Several functions which can be analyzed in this way are given in the problems, which also contain some theoretical questions.

There is one further fact which I cannot resist mentioning here. We have 
seen that convex and concave functions have the property that every tangent 
line intersects the graph just once, and a few drawings will probably convince 
you that no other functions have this property. This is indeed the case, but the 
only proof I know contains so many delicate points that I feel it is wiser to omit it.


PROBLEMS 


1. Sketch, indicating regions of convexity and concavity and points of 
inflection, the functions in Problem 11-1 (consider (iv) as double 
starred). 

2. Figure 25 in Chapter 11 shows the graph of f'. Sketch the graph of f.

3. Find two convex functions f and g such that f(x) = g(x) if and only if x is 
an integer. 

4. Show that f is convex on an interval if and only if for all x and y in the interval we have

\[ f(tx + (1 - t)y) < tf(x) + (1 - t)f(y), \quad \text{for } 0 < t < 1. \]


(This is just a restatement of the definition, but a useful one.) 
5. Prove that if f and g are convex and f is increasing, then $f \circ g$ is convex.
*6. Let f be a twice-differentiable function with the following properties: f(x) > 0 for x > 0, and f is decreasing, and f'(0) = 0. Prove that f''(x) = 0 for some x > 0 (so that in reasonable cases f will have an inflection point at x; an example is given by $f(x) = 1/(1 + x^2)$). Every hypothesis in this theorem is essential, as shown by $f(x) = 1 - x^2$, which is not positive for all x; by $f(x) = x^2$, which is not decreasing; and by f(x) = 1/(x + 1), which does not satisfy f'(0) = 0. Hint: Choose $x_0 > 0$ with $f'(x_0) < 0$. We cannot have $f'(y) \leq f'(x_0)$ for all $y > x_0$. Why not? So $f'(x_1) > f'(x_0)$ for some $x_1 > x_0$. Consider f' on $[0, x_1]$.
*7. (a) Prove that if f is convex, then f([x + y]/2) < [f(x) + f(y)]/2. 
(b) Suppose that f satisfies this condition. Show that f(kx + (1 − k)y) < kf(x) + (1 − k)f(y) whenever k is a rational number, between 0 and 1, of the form $m/2^n$. Hint: Part (a) is the special case n = 1. Use
induction, employing part (a) at each step. 
(c) Suppose that f satisfies the condition in part (a) and f is continuous. 
Show that f is convex. 


*8. Let $p_1, \ldots, p_n$ be positive numbers with $\sum_{i=1}^{n} p_i = 1$.

(a) For any numbers $x_1, \ldots, x_n$ show that $\sum_{i=1}^{n} p_i x_i$ lies between the smallest and the largest $x_i$.

(b) Show the same for $(1/t) \sum_{i=1}^{n-1} p_i x_i$, where $t = \sum_{i=1}^{n-1} p_i$.

(c) Prove Jensen's inequality: If f is convex, then $f\left(\sum_{i=1}^{n} p_i x_i\right) \leq \sum_{i=1}^{n} p_i f(x_i)$. Hint: Use Problem 4, noting that $p_n = 1 - t$. (Part (b) is needed to show that $(1/t) \sum_{i=1}^{n-1} p_i x_i$ is in the domain of f if $x_1, \ldots, x_n$ are.)

*9. (a) For any function f, the right-hand derivative, $\lim_{h \to 0^+} [f(a + h) - f(a)]/h$, is denoted by $f_+'(a)$, and the left-hand derivative is denoted by $f_-'(a)$. The proof of Theorem 1 actually shows that $f_+'$ and $f_-'$ always exist if f is convex. Check this assertion, and also show that $f_+'$ and $f_-'$ are increasing, and that $f_-'(a) \leq f_+'(a)$.

**(b) Show that if f is convex, then $f_+'(a) = f_-'(a)$ if and only if $f_+'$ is continuous at a. (Thus f is differentiable precisely when $f_+'$ is continuous.) Hint: [f(b) − f(a)]/(b − a) is close to $f_-'(a)$ for b < a close to a, and $f_+'(b)$ is less than this quotient.

**10. Prove that a convex function must be continuous.

*11. Prove that if f is convex on R, then either f is decreasing, or else f is increasing, or else there is a number c such that f is decreasing on (−∞, c] and increasing on [c, ∞). This problem requires a lot of careful bookkeeping, and is perhaps best postponed until after the next chapter, where a similar theorem is proved in detail. An important step for this problem is to establish that if f(a) < f(b) for some a < b, then f is increasing on [b, ∞), while f is decreasing on (−∞, a] if f(a) > f(b).


CHAPTER 


INVERSE FUNCTIONS

We now have at our disposal quite powerful methods for investigating functions; what we lack is an adequate supply of functions to which these methods may be applied. We have studied various ways of forming new functions from old (addition, multiplication, division, and composition), but using these alone, we can produce only the rational functions (even the sine function, although frequently used for examples, has never been defined). In the next few chapters we will begin to construct new functions in quite sophisticated ways, but there is one important method which will practically double the usefulness of any other method we discover.

If we recall that a function is a collection of pairs of numbers, we might hit upon the bright idea of simply reversing all the pairs. Thus from the function

f = {(1, 2), (3, 4), (5, 9), (13, 8)}

we obtain

g = {(2, 1), (4, 3), (9, 5), (8, 13)}.

While f(1) = 2 and f(3) = 4, we have g(2) = 1 and g(4) = 3.

Unfortunately, this bright idea does not always work. If

f = {(1, 2), (3, 4), (5, 9), (13, 4)},

then the collection

{(2, 1), (4, 3), (9, 5), (4, 13)}

is not a function at all, since it contains both (4, 3) and (4, 13). It is clear where the trouble lies: f(3) = f(13), even though 3 ≠ 13. This is the only sort of thing that can go wrong, and it is worthwhile giving a name to the functions for which this does not happen.

DEFINITION A function f is one-one (read "one-to-one") if f(a) ≠ f(b) whenever a ≠ b.

The identity function I is obviously one-one, and so is the following modification:

\[ g(x) = \begin{cases} x, & x \neq 3, 5 \\ 3, & x = 5 \\ 5, & x = 3. \end{cases} \]

The function $f(x) = x^2$ is not one-one, since f(−1) = f(1), but if we define

\[ g(x) = x^2, \quad x \geq 0 \]

(and leave g undefined for x < 0), then g is one-one, because g is increasing


(since g'(x) = 2x > 0, for x > 0). This observation is easily generalized: If n is a natural number and

\[ f(x) = x^n, \quad x \geq 0, \]

then f is one-one. If n is odd, one can do better: the function

\[ f(x) = x^n \quad \text{for all } x \]

is one-one (since $f'(x) = nx^{n-1} > 0$, for all x ≠ 0).

It is particularly easy to decide from the graph of f whether f is one-one: the condition f(a) ≠ f(b) for a ≠ b means that no horizontal line intersects the graph of f twice (Figure 1).

FIGURE 1: (a) a one-one function; (b) a function which is not one-one

If we reverse all the pairs in (a not necessarily one-one function) f we obtain, in any case, some collection of pairs. It is popular to abstain from this procedure unless f is one-one, but there is no particular reason to do so; instead of a definition with restrictive conditions we obtain a definition and a theorem.

DEFINITION For any function f, the inverse of f, denoted by $f^{-1}$, is the set of all pairs (a, b) for which the pair (b, a) is in f.

THEOREM 1 $f^{-1}$ is a function if and only if f is one-one.

PROOF Suppose first that f is one-one. Let (a, b) and (a, c) be two pairs in $f^{-1}$. Then (b, a) and (c, a) are in f, so a = f(b) and a = f(c); since f is one-one this implies that b = c. Thus $f^{-1}$ is a function.

Conversely, suppose that $f^{-1}$ is a function. If f(b) = f(c), then f contains the pairs (b, f(b)) and (c, f(c)) = (c, f(b)), so (f(b), b) and (f(b), c) are in $f^{-1}$. Since $f^{-1}$ is a function this implies that b = c. Thus f is one-one.
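The pair-reversal construction and Theorem 1 can be tried out directly on finite collections of pairs. The sketch below uses the two collections from the start of the chapter; the helper names are illustrative, and a Python set of tuples stands in for "a collection of pairs".

```python
def inverse(pairs):
    """Reverse every pair (a, b) in the collection into (b, a)."""
    return {(b, a) for (a, b) in pairs}


def is_function(pairs):
    """A collection of pairs is a function if no first element has two different partners."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True


def is_one_one(pairs):
    """For a function given as pairs: f(a) != f(b) whenever a != b."""
    values = [b for (_, b) in pairs]
    return len(set(values)) == len(values)


f_good = {(1, 2), (3, 4), (5, 9), (13, 8)}
f_bad = {(1, 2), (3, 4), (5, 9), (13, 4)}   # f(3) = f(13)

for name, f in (("f_good", f_good), ("f_bad", f_bad)):
    print(name, "one-one:", is_one_one(f), "inverse is a function:", is_function(inverse(f)))
```

In both cases the two answers agree, as Theorem 1 predicts, and reversing the pairs twice returns the original collection.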


The graphs of f and $f^{-1}$ are so closely related that it is possible to use the graph of f to visualize the graph of $f^{-1}$. Since the graph of $f^{-1}$ consists of all pairs (a, b) with (b, a) in the graph of f, one obtains the graph of $f^{-1}$ from the graph of f by interchanging the horizontal and vertical axis. If f has the graph shown in Figure 2(a),

FIGURE 2(a)

FIGURE 4: the points (a, b) and (b, a) on either side of the diagonal


This procedure is awkward with books and impossible with blackboards, so it is fortunate that there is another way of constructing the graph of $f^{-1}$. The points (a, b) and (b, a) are reflections of each other through the graph of I(x) = x, which is called the diagonal (Figure 4). To obtain the graph of $f^{-1}$ we merely reflect the graph of f through this line (Figure 5).

Reflecting through the diagonal twice will clearly leave us right back where we started; this means that $(f^{-1})^{-1} = f$, which is also clear from the definition. In conjunction with Theorem 1, this equation has a significant consequence: if f is a one-one function, then the function $f^{-1}$ is also one-one (since $(f^{-1})^{-1}$ is a function).

There are a few other simple manipulations with inverse functions of which you should be aware. Since (a, b) is in f precisely when (b, a) is in $f^{-1}$, it follows that

b = f(a) means the same as $a = f^{-1}(b)$.

FIGURE 5

Thus $f^{-1}(b)$ is the (unique) number a such that f(a) = b; for example, if $f(x) = x^3$, then $f^{-1}(b)$ is the unique number a such that $a^3 = b$, and this number is, by definition, $\sqrt[3]{b}$.

The fact that $f^{-1}(x)$ is the number y such that f(y) = x can be restated in a much more compact form:

\[ f(f^{-1}(x)) = x, \quad \text{for all } x \text{ in the domain of } f^{-1}. \]

Moreover,

\[ f^{-1}(f(x)) = x, \quad \text{for all } x \text{ in the domain of } f; \]

this follows from the previous equation upon replacing f by $f^{-1}$. These two important equations can be written

\[ f \circ f^{-1} = I, \qquad f^{-1} \circ f = I \]

(except that the right side will have a bigger domain if the domain of f or $f^{-1}$ is not all of R).

Since many standard functions will be defined as the inverses of other functions, it is quite important that we be able to tell which functions are one-one. We have already hinted which class of functions are most easily dealt with: increasing and decreasing functions are obviously one-one. Moreover, if f is increasing, then $f^{-1}$ is also increasing, and if f is decreasing, then $f^{-1}$ is decreasing (the proof is left to you). In addition, f is increasing if and only if −f is decreasing, a very useful fact to remember.

It is certainly not true that every one-one function is either increasing or decreasing. One example has already been mentioned, and is now graphed in Figure 6:

\[ g(x) = \begin{cases} x, & x \neq 3, 5 \\ 3, & x = 5 \\ 5, & x = 3. \end{cases} \]

FIGURE 6

Figure 7 shows that there are even continuous one-one functions which are neither increasing nor decreasing. But if you try drawing a few pictures you will soon agree that every one-one continuous function defined on an interval is either increasing or decreasing. This theorem is a bureaucrat's delight; the proof uses no new ideas, but it does require careful organization. It helps to begin with two lemmas, the second being a corollary of the first.

FIGURE 7

LEMMA 1 Let f be a continuous one-one function defined on an interval, and let a, b, and c be points of the interval with a < c < b. If f(a) < f(b), then f(a) < f(c) < f(b); and if f(a) > f(b), then f(a) > f(c) > f(b) (as in Figure 8).

PROOF Consider first the case where f(a) < f(b). Since c ≠ a, b we have f(c) ≠ f(a),


and f(c) ≠ f(b), so

either f(c) < f(a)
or f(a) < f(c) < f(b)
or f(b) < f(c),

and it is only necessary to rule out the first and third possibilities.

FIGURE 8: (a), (b)

Suppose that f(c) < f(a) were true (Figure 9). Apply the Intermediate Value Theorem to the interval [c, b]: since f(c) < f(a) < f(b) there is some x in [c, b] such that f(x) = f(a). But this contradicts the fact that f is one-one, since a ≠ x. Thus f(c) < f(a) is impossible.

FIGURE 9

Similarly, suppose that f(c) > f(b) were true (Figure 10). Apply the Intermediate Value Theorem to the interval [a, c]: since f(a) < f(b) < f(c) there is some x in [a, c] such that f(x) = f(b), again contradicting the fact that f is one-one. Thus f(c) > f(b) is also false, leaving only the possibility f(a) < f(c) < f(b).

FIGURE 10

The case where f(a) > f(b) may be handled in precisely the same way. Or one can be elegant and apply the first case to −f.

LEMMA 2 Let f be a continuous one-one function defined on an interval, let a and b be two points of the interval, and suppose that f(a) < f(b). If c is a point of the interval with c < a, then f(c) < f(a); and similarly, if b < c, then f(b) < f(c). (Analogous statements could be made if f(a) > f(b), but we will not need them.)

PROOF Consider the case c < a. Suppose first that f(c) > f(b). Then Lemma 1, applied to c < a < b, would imply that f(a) > f(b), which is false. Hence f(c) < f(b). But now Lemma 1 may be applied to c < a < b to conclude that

f(c) < f(a) < f(b).

The case b < c is handled similarly.

Finally, we are ready for the theorem.

THEOREM 2 If f is continuous and one-one on an interval, then f is either increasing or decreasing on that interval.

PROOF Let a and b be two numbers in the interval with a < b. Since f(a) ≠ f(b),


FIGURE 11 


7) 
10) 


FIGURE 12 


Inverse Functions 205 


either f(a) < f(b) or f(a) > f(b). We will first concentrate on the case f(a) < 
f(b), and show that f is increasing. 

Let c and d be two points in the interval with ὁ < d. Ione (or both) of ¢, d 
equals a or ὦ, then it follows immediately from Lemmas 1 and 2 that f(c) < 
f(d). The other possibility leads to six cases: 


(1) c < d < a < b
(2) c < a < d < b
(3) c < a < b < d
(4) a < c < d < b
(5) a < c < b < d
(6) a < b < c < d


Each of these cases can be treated easily, using the lemmas. For example, in
case (1) first apply Lemma 2 to c < a < b to conclude that f(c) < f(a), and
then apply Lemma 1 to c < d < a to conclude that f(c) < f(d). The other
cases are left to you, as an easy exercise.

If f(a) > f(b), one can prove similarly that f is decreasing. Or one can apply
the result just proved to −f (clearly the preferable method). ∎


Henceforth we shall be concerned almost exclusively with continuous
increasing or decreasing functions which are defined on an interval. If f is such
a function, it is possible to say quite precisely what the domain of $f^{-1}$ will be
like.

Suppose first that f is a continuous increasing function on the closed interval
[a, b]. Then, by the Intermediate Value Theorem, f takes on every value
between f(a) and f(b). Therefore, the domain of $f^{-1}$ is the closed interval
[f(a), f(b)]. Similarly, if f is continuous and decreasing on [a, b], then the
domain of $f^{-1}$ is [f(b), f(a)].

If f is a continuous increasing function on an open interval (a, b) the analysis
becomes a bit more difficult. To begin with, let us choose some point c in
(a, b). We will first decide which values > f(c) are taken on by f. One
possibility is that f takes on arbitrarily large values (Figure 11). In this case f takes
on all values > f(c), by the Intermediate Value Theorem. If, on the other hand,
f does not take on arbitrarily large values, then A = {f(x): c < x < b} is
bounded above, so A has a least upper bound α (Figure 12). Now suppose y is
any number with f(c) < y < α. Then f takes on some value f(x) > y (because
α is the least upper bound of A). By the Intermediate Value Theorem, f
actually takes on the value y. Notice that f cannot take on the value α itself;
for if α = f(x) for a < x < b and we choose t with x < t < b, then f(t) > α,
which is impossible.


Precisely the same arguments work for values less than f(c): either f takes
on all values less than f(c) or there is a number β < f(c) such that f takes on all
values between β and f(c), but not β itself.

This entire argument can be repeated if f is decreasing, and if the domain
of f is R or (a, ∞) or (−∞, a). Summarizing: if f is a continuous increasing,
or decreasing, function whose domain is an interval having one of the forms

(a, b), (−∞, b), (a, ∞), or R,

then the domain of $f^{-1}$ is also an interval which has one of these four forms.

Now that we have completed this preliminary analysis of continuous
one-one functions, it is possible to begin asking which important properties of a
one-one function are inherited by its inverse. For continuity there is no
problem.

FIGURE 13

THEOREM 3

If f is continuous and one-one on an interval, then $f^{-1}$ is also continuous.

PROOF


We know by Theorem 2 that f is either increasing or decreasing. We might as
well assume that f is increasing, since we can then take care of the other case
by applying the usual trick of considering −f.

We must show that

\[ \lim_{x \to b} f^{-1}(x) = f^{-1}(b) \]

for each b in the domain of $f^{-1}$. Such a number b is of the form f(a) for some a
in the domain of f. For any ε > 0, we want to find a δ > 0 such that, for all x,

\[ \text{if } f(a) - \delta < x < f(a) + \delta, \text{ then } a - \varepsilon < f^{-1}(x) < a + \varepsilon. \]

Figure 13 suggests the way of finding δ (remember that by looking sideways
you see the graph of $f^{-1}$): since

\[ a - \varepsilon < a < a + \varepsilon, \]

it follows that

\[ f(a - \varepsilon) < f(a) < f(a + \varepsilon); \]

we let δ be the smaller of f(a + ε) − f(a) and f(a) − f(a − ε). Figure 13
contains the entire proof that this δ works, and what follows is simply a verbal
account of the information contained in this picture.

Our choice of δ ensures that

\[ f(a - \varepsilon) \le f(a) - \delta \quad\text{and}\quad f(a) + \delta \le f(a + \varepsilon). \]

Consequently, if

\[ f(a) - \delta < x < f(a) + \delta, \]

then

\[ f(a - \varepsilon) < x < f(a + \varepsilon). \]

Since f is increasing, $f^{-1}$ is also increasing, and we obtain

\[ f^{-1}(f(a - \varepsilon)) < f^{-1}(x) < f^{-1}(f(a + \varepsilon)), \]

i.e.,

\[ a - \varepsilon < f^{-1}(x) < a + \varepsilon, \]

which is precisely what we want. ∎

FIGURE 14

Having successfully investigated continuity of $f^{-1}$, it is only reasonable to
tackle differentiability. Again, a picture indicates just what result ought to be
true. Figure 14 shows the graph of a one-one function f with a tangent line L
through (a, f(a)). If this entire picture is reflected through the diagonal, it
shows the graph of $f^{-1}$ and the tangent line L' through (f(a), a). The slope of
L' is the reciprocal of the slope of L. In other words, it appears that

\[ (f^{-1})'(f(a)) = \frac{1}{f'(a)}. \]

This formula can equally well be written in a way which expresses $(f^{-1})'(b)$
directly, for each b in the domain of $f^{-1}$:

\[ (f^{-1})'(b) = \frac{1}{f'(f^{-1}(b))}. \]

Unlike the argument for continuity, this pictorial "proof" becomes
somewhat involved when formulated analytically. There is another approach
which might be tried. Since we know that

\[ f(f^{-1}(x)) = x, \]

it is tempting to prove the desired formula by applying the Chain Rule:

\[ f'(f^{-1}(x)) \cdot (f^{-1})'(x) = 1, \]

so

\[ (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}. \]

Unfortunately, this is not a proof that $f^{-1}$ is differentiable, since the Chain
Rule cannot be applied unless $f^{-1}$ is already known to be differentiable. But
this argument does show what $(f^{-1})'(x)$ will have to be if $f^{-1}$ is differentiable,
and it can also be used to obtain some important preliminary information.

THEOREM 4

If f is a continuous one-one function defined on an interval and $f'(f^{-1}(a)) = 0$,
then $f^{-1}$ is not differentiable at a.

PROOF


We have

\[ f(f^{-1}(x)) = x. \]

If $f^{-1}$ were differentiable at a, the Chain Rule would imply that

\[ f'(f^{-1}(a)) \cdot (f^{-1})'(a) = 1, \]

hence

\[ 0 \cdot (f^{-1})'(a) = 1, \]

which is absurd. ∎


A simple example to which Theorem 4 applies is the function $f(x) = x^3$.
Since f'(0) = 0 and $0 = f^{-1}(0)$, the function $f^{-1}$ is not differentiable at 0
(Figure 15).

FIGURE 15. (a) $f(x) = x^3$; (b) $f^{-1}(x) = \sqrt[3]{x}$

Having decided where an inverse function cannot be differentiable, we are
now ready for the rigorous proof that in all other cases the derivative is given
by the formula which we have already "derived" in two different ways.
Notice that the following argument uses continuity of $f^{-1}$, which we have
already proved.

THEOREM 5

Let f be a continuous one-one function defined on an interval, and suppose
that f is differentiable at $f^{-1}(b)$, with derivative $f'(f^{-1}(b)) \ne 0$. Then $f^{-1}$ is
differentiable at b, and

\[ (f^{-1})'(b) = \frac{1}{f'(f^{-1}(b))}. \]

PROOF


Let b = f(a). Then

\[ \lim_{h \to 0} \frac{f^{-1}(b + h) - f^{-1}(b)}{h} = \lim_{h \to 0} \frac{f^{-1}(b + h) - a}{h}. \]

Now every number b + h in the domain of $f^{-1}$ can be written in the form

\[ b + h = f(a + k) \]

for a unique k (we should really write k(h), but we will stick with k for
simplicity). Then

\[ \lim_{h \to 0} \frac{f^{-1}(b + h) - a}{h}
   = \lim_{h \to 0} \frac{f^{-1}(f(a + k)) - a}{f(a + k) - b}
   = \lim_{h \to 0} \frac{k}{f(a + k) - f(a)}. \]

We are clearly on the right track! It is not hard to get an explicit expression
for k; since

\[ b + h = f(a + k) \]

we have

\[ f^{-1}(b + h) = a + k \]

or

\[ k = f^{-1}(b + h) - f^{-1}(b). \]

Now by Theorem 3, the function $f^{-1}$ is continuous at b. This means that k
approaches 0 as h approaches 0. Since

\[ \lim_{k \to 0} \frac{f(a + k) - f(a)}{k} = f'(a) = f'(f^{-1}(b)) \ne 0, \]

this implies that

\[ (f^{-1})'(b) = \frac{1}{f'(f^{-1}(b))}. \] ∎
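The formula of Theorem 5 can be checked numerically. The following is only a sketch under stated assumptions: the sample function f(x) = x³ + x, the search interval [−10, 10], the bisection routine `inverse`, and the symmetric difference quotient are all illustrative choices that do not come from the text.

```python
def inverse(f, y, lo, hi, tol=1e-13):
    """Approximate f^{-1}(y) by bisection.

    Assumes f is continuous and increasing on [lo, hi] and f(lo) <= y <= f(hi).
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


def f(x):
    # Sample function (an assumption): continuous, increasing, f'(x) > 0 everywhere.
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


def inverse_derivative_by_formula(b):
    # Theorem 5: (f^{-1})'(b) = 1 / f'(f^{-1}(b)).
    a = inverse(f, b, -10.0, 10.0)
    return 1.0 / f_prime(a)


def inverse_derivative_by_quotient(b, h=1e-5):
    # Symmetric difference quotient of f^{-1} at b, used as a numerical stand-in
    # for the limit in the proof.
    g = lambda y: inverse(f, y, -10.0, 10.0)
    return (g(b + h) - g(b - h)) / (2 * h)


if __name__ == "__main__":
    for b in (-2.0, 0.0, 2.0, 10.0):
        print(b, inverse_derivative_by_formula(b), inverse_derivative_by_quotient(b))
```

For b = 2 we have f(1) = 2, so the formula gives 1/f'(1) = 1/4, and the difference quotient should agree to several decimal places.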


The work we have done on inverse functions will be amply repaid later, but
here is an immediate dividend. For n odd, let

\[ f_n(x) = x^n \quad \text{for all } x; \]

for n even, let

\[ f_n(x) = x^n, \quad x \ge 0. \]

Then $f_n$ is a continuous one-one function, whose inverse function is

\[ g_n(x) = \sqrt[n]{x} = x^{1/n}. \]

By Theorem 5 we have, for x ≠ 0,

\[ g_n'(x) = \frac{1}{f_n'(f_n^{-1}(x))}
          = \frac{1}{n\,(f_n^{-1}(x))^{n-1}}
          = \frac{1}{n\,(x^{1/n})^{n-1}}
          = \frac{1}{n} \cdot \frac{1}{x^{1-(1/n)}}
          = \frac{1}{n}\, x^{(1/n)-1}. \]

Thus, if $f(x) = x^a$, and a is an integer or the reciprocal of a natural number,
then $f'(x) = a x^{a-1}$. It is now easy to check that this formula is true if a is any
rational number: Let a = m/n, where m is an integer, and n is a natural
number; if

\[ f(x) = x^{m/n} = (x^{1/n})^m, \]

then, by the Chain Rule,

\[ f'(x) = m\,(x^{1/n})^{m-1} \cdot \frac{1}{n} \cdot x^{(1/n)-1}
         = \frac{m}{n} \cdot x^{[(m/n)-(1/n)] + [(1/n)-1]}
         = \frac{m}{n}\, x^{(m/n)-1}. \]
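As a quick sanity check of the rule $f'(x) = a x^{a-1}$ for rational a, the sketch below compares the formula with a symmetric difference quotient. The helper names, the exponent a = 2/3, and the sample points are assumptions made for illustration; floating-point powers stand in for the exact roots used in the text.

```python
def power_rule(a, x):
    """Value of a * x**(a - 1), the derivative formula derived for rational a."""
    return a * x ** (a - 1)


def symmetric_quotient(g, x, h=1e-6):
    """Numerical approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


if __name__ == "__main__":
    m, n = 2, 3          # a = m/n, with m an integer and n a natural number
    a = m / n
    f = lambda x: x ** a
    for x in (0.5, 1.0, 8.0):
        print(x, power_rule(a, x), symmetric_quotient(f, x))
```

At x = 8 the formula gives (2/3) · 8^(−1/3) = 1/3, which the difference quotient reproduces to within rounding error.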


Although we now have a formula for f'(x) when $f(x) = x^a$ and a is rational,
the treatment of the function $f(x) = x^a$ for irrational a will have to be
saved for later; at the moment we do not even know the meaning of a symbol
like $x^{\sqrt{2}}$. Actually, inverse functions will be involved crucially in the definition
of $x^a$ for irrational a. Indeed, in the next few chapters several important
functions will be defined in terms of their inverse functions.


PROBLEMS 


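1. Find $f^{-1}$ for each of the following f.

(i) $f(x) = x^3 + 1$.
(ii) $f(x) = (x - 1)^3$.
(iii) $f(x) = \begin{cases} x, & x \text{ rational} \\ -x, & x \text{ irrational.} \end{cases}$
(iv) $f(x) = \begin{cases} -x^2, & x \ge 0 \\ 1 - x^3, & x < 0. \end{cases}$
(v) $f(x) = \begin{cases} x, & x \ne a_1, \ldots, a_n \\ a_{i+1}, & x = a_i,\ i = 1, \ldots, n - 1 \\ a_1, & x = a_n. \end{cases}$
(vi) $f(x) = x + [x]$.
(vii) $f(0.a_1a_2a_3 \ldots) = 0.a_2a_1a_3 \ldots$. (Decimal representation is
being used.)
(viii) $f(x) = \dfrac{x}{1 - x^2}$, $-1 < x < 1$.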

2. Describe the graph of $f^{-1}$ when

(i) f is increasing and always positive.
(ii) f is increasing and always negative.
(iii) f is decreasing and always positive.
(iv) f is decreasing and always negative.

3. Complete the proof of Theorem 2.

4. Prove that if f is increasing, then so is $f^{-1}$, and similarly for decreasing
functions.

5. If f and g are increasing, is f + g? Or f · g? Or f ∘ g?

6. (a) Prove that if f and g are one-one, then f ∘ g is also one-one. Find
$(f \circ g)^{-1}$ in terms of $f^{-1}$ and $g^{-1}$. Hint: The answer is not $f^{-1} \circ g^{-1}$.

(b) Find $g^{-1}$ in terms of $f^{-1}$ if g(x) = 1 + f(x).

7. Show that $f(x) = \dfrac{ax + b}{cx + d}$ is one-one if and only if ad − bc ≠ 0, and
find $f^{-1}$ in this case.


8. On which intervals [a, b] will the following functions be one-one?

(i) $f(x) = x^3 - 3x^2$.
(ii) $f(x) = x^5 + x$.
(iii) $f(x) = (1 + x^2)^{-1}$.
(iv) $f(x) = \dfrac{x + 1}{x^2 + 1}$.


9. Suppose that f is a one-one function and that $f^{-1}$ has a derivative which
is nowhere 0. Prove that f is differentiable. Hint: There is a one-step
proof.

*10. (a) Prove that there is a differentiable function f such that
$[f(x)]^5 + f(x) + x = 0$ for all x. Hint: Show that f can be expressed as an
inverse function. The easiest way to do this is to find $f^{-1}$. And the
easiest way to do this is to set $x = f^{-1}(y)$.

(b) Find f' in terms of f, using an appropriate theorem of this chapter.
(c) Find f' in another way, by simply differentiating the equation
defining f.


The function in Problem 10 is often said to be defined implicitly by the
equation $y^5 + y + x = 0$. The situation for this equation is quite special,
however. As the next problem shows, an equation does not usually define a
function implicitly on the whole line, and in some regions more than one
function may be defined implicitly.


11. (a) What are the two differentiable functions f which are defined
implicitly on (−1, 1) by the equation $x^2 + y^2 = 1$, i.e., which
satisfy $x^2 + [f(x)]^2 = 1$ for all x in (−1, 1)? Notice that there are
no solutions defined outside [−1, 1].

(b) Which functions f satisfy $x^2 + [f(x)]^2 = -1$?
*(c) Which differentiable functions f satisfy $[f(x)]^3 - 3f(x) = x$? Hint:
It will help to first draw the graph of the function $g(x) = x^3 - 3x$.


In general, determining on what intervals a differentiable function is
defined implicitly by a particular equation may be a delicate affair, and is
best discussed in the context of advanced calculus. If we assume that f is such
a differentiable solution, however, then a formula for f'(x) can be derived,
exactly as in Problem 10(c), by differentiating both sides of the equation
defining f (a process known as "implicit differentiation"):


12. (a) Apply this method to the equation $[f(x)]^2 + x^2 = 1$. Notice that
your answer will involve f(x); this is only to be expected, since there
is more than one function defined implicitly by the equation
$y^2 + x^2 = 1$.

(b) But check that your answer works for both of the functions f found
in Problem 11(a).

(c) Apply this same method to $[f(x)]^3 - 3f(x) = x$.

13. Leibnizian notation is particularly convenient for implicit
differentiation. Because y is so consistently used as an abbreviation for f(x), the
equation in x and y which defines f implicitly will automatically stand
for the equation which f is supposed to satisfy. How would the following
computation be written in our notation?

\[ y^4 + y^3 + xy = 1, \]
\[ 4y^3 \frac{dy}{dx} + 3y^2 \frac{dy}{dx} + y + x \frac{dy}{dx} = 0, \]
\[ \frac{dy}{dx} = \frac{-y}{4y^3 + 3y^2 + x}. \]

14. As long as Leibnizian notation has entered the picture, the Leibnizian
notation for derivatives of inverse functions should be mentioned. If
dy/dx denotes the derivative of f, then the derivative of $f^{-1}$ is denoted by
dx/dy. Write out Theorem 5 in this notation. The resulting equation will
show you another reason why Leibnizian notation has such a large
following. It will also explain at which point $(f^{-1})'$ is to be calculated
when using the dx/dy notation. What is the significance of the following
computation?

\[ x = y^n, \qquad y = x^{1/n}, \]
\[ \frac{d\,x^{1/n}}{dx} = \frac{dy}{dx} = \frac{1}{\dfrac{dx}{dy}} = \frac{1}{n y^{n-1}}. \]

15. Suppose that f is a differentiable one-one function with a nowhere zero
derivative and that f = F'. Let $G(x) = x f^{-1}(x) - F(f^{-1}(x))$. Prove that
$G'(x) = f^{-1}(x)$. (Disregarding details, this problem tells us a very
interesting fact: if we know a function whose derivative is f, then we also
know one whose derivative is $f^{-1}$. But how could anyone ever guess the
function G? Two different ways are outlined in Problems 14-10 and
18-13.)

16. Suppose h is a function such that $h'(x) = \sin^2(\sin(x + 1))$ and h(0) = 3.
Find

(i) $(h^{-1})'(3)$.
(ii) $(\beta^{-1})'(3)$, where β(x) = h(x + 1).

17. Find a formula for $(f^{-1})''(x)$.

*18. Prove that if $f^{(k)}(f^{-1}(x))$ exists, and is nonzero, then $(f^{-1})^{(k)}(x)$ exists.


19. (a) Prove that an increasing and a decreasing function intersect at most
once.

(b) Find two continuous increasing functions f and g such that f(x) =
g(x) precisely when x is an integer.

(c) Find a continuous increasing function f and a continuous decreasing
function g, defined on R, which do not intersect at all.

*20. (a) If f is a continuous function on R and $f = f^{-1}$, prove that there is at
least one x such that f(x) = x. (What does the condition $f = f^{-1}$
mean geometrically?)

(b) Give several examples of continuous f such that $f = f^{-1}$ and f(x) = x
for exactly one x. Hint: Try decreasing f, and remember the
geometric interpretation. One possibility is f(x) = −x.

(c) Prove that if f is an increasing function such that $f = f^{-1}$, then
f(x) = x for all x. Hint: Although the geometric interpretation will
be immediately convincing, the simplest proof (about 2 lines) is to
rule out the possibilities f(x) < x and f(x) > x.

21. Which functions have the property that the graph is still the graph of a
function when reflected through the graph of −I (the "antidiagonal")?

22. A function f is nondecreasing if f(x) ≤ f(y) whenever x < y. (To be
more precise we should stipulate that the domain of f be an interval.)
A nonincreasing function is defined similarly. Caution: Some writers
use "increasing" instead of "nondecreasing," and "strictly increasing"
for our "increasing."

(a) Prove that if f is nondecreasing, but not increasing, then f is constant
on some interval. (Beware of unintentional puns: "not increasing"
is not the same as "nonincreasing.")

(b) Prove that if f is differentiable and nondecreasing, then f'(x) ≥ 0
for all x.

(c) Prove that if f'(x) ≥ 0 for all x, then f is nondecreasing.

23. (a) Suppose that f(x) > 0 for all x, and that f is decreasing. Prove that
there is a continuous decreasing function g such that 0 < g(x) ≤ f(x)
for all x.

(b) Show that we can even arrange that g will satisfy
$\lim_{x \to \infty} g(x)/f(x) = 0$.


CHAPTER


INTEGRALS


FIGURE 1


FIGURE 2


The derivative does not display its full strength until allied with the “integral,” 
the second main concept of Part III. At first this topic may seem to be a com- 
plete digression—in this chapter derivatives do not appear even once! The 
study of integrals does require a long preparation, but once this preliminary 
work has been completed, integrals will be an invaluable tool for creating new 
functions, and the derivative will reappear in Chapter 14, more powerful than 
ever. 

Although ultimately to be defined in a quite complicated way, the integral 
formalizes a simple, intuitive concept—that of area. By now it should come 
as no surprise to learn that the definition of an intuitive concept can present 
great difficulties—‘‘area”’ is certainly no exception. 

In elementary geometry, formulas are derived for the areas of many plane
figures, but a little reflection shows that an acceptable definition of area is
seldom given. The area of a region is sometimes defined as the number of
squares, with sides of length 1, which fit in the region. But this definition is
hopelessly inadequate for any but the simplest regions. For example, a circle
of radius 1 supposedly has as area the irrational number π, but it is not at all
clear what "π squares" means. Even if we consider a circle of radius 1/√π,
which supposedly has area 1, it is hard to say in what way a unit square fits in
this circle, since it does not seem possible to divide the unit square into pieces
which can be arranged to form a circle.

In this chapter we will only try to define the area of some very special
regions (Figure 1): those which are bounded by the horizontal axis, the
vertical lines through (a, 0) and (b, 0), and the graph of a function f such that
f(x) ≥ 0 for all x in [a, b]. It is convenient to indicate this region by R(f, a, b).
Notice that these regions include rectangles and triangles, as well as many
other important geometric figures.

The number which we will eventually assign as the area of R(f, a, b) will be
called the integral of f on [a, b]. Actually, the integral will be defined even for
functions f which do not satisfy the condition f(x) ≥ 0 for all x in [a, b]. If f is
the function graphed in Figure 2, the integral will represent the difference of
the area of the lightly shaded region and the area of the heavily shaded
region (the "algebraic area" of R(f, a, b)).

The idea behind the prospective definition is indicated in Figure 3. The
interval [a, b] has been divided into four subintervals

\[ [t_0, t_1] \quad [t_1, t_2] \quad [t_2, t_3] \quad [t_3, t_4] \]

by means of numbers $t_0, t_1, t_2, t_3, t_4$ with

\[ a = t_0 < t_1 < t_2 < t_3 < t_4 = b \]

(the numbering of the subscripts begins with 0 so that the largest subscript
will equal the number of subintervals).

FIGURE 3

On the first interval $[t_0, t_1]$ the function f has the minimum value $m_1$ and the
maximum value $M_1$; similarly, on the ith interval $[t_{i-1}, t_i]$ let the minimum
value of f be $m_i$ and let the maximum value be $M_i$. The sum

\[ s = m_1(t_1 - t_0) + m_2(t_2 - t_1) + m_3(t_3 - t_2) + m_4(t_4 - t_3) \]

represents the total area of rectangles lying inside the region R(f, a, b), while
the sum

\[ S = M_1(t_1 - t_0) + M_2(t_2 - t_1) + M_3(t_3 - t_2) + M_4(t_4 - t_3) \]

represents the total area of rectangles containing the region R(f, a, b). The
guiding principle of our attempt to define the area A of R(f, a, b) is the
observation that A should satisfy

\[ s \le A \quad\text{and}\quad A \le S, \]

and that this should be true, no matter how the interval [a, b] is subdivided. It is to
be hoped that these requirements will determine A. The following definitions
begin to formalize, and eliminate some of the implicit assumptions in, this
discussion.

DEFINITION

Let a < b. A partition of the interval [a, b] is a finite collection of points in
[a, b], one of which is a, and one of which is b.

The points in a partition can be numbered $t_0, \ldots, t_n$ so that

\[ a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b; \]

we shall always assume that such a numbering has been assigned.

DEFINITION

Suppose f is bounded on [a, b] and $P = \{t_0, \ldots, t_n\}$ is a partition of
[a, b]. Let

\[ m_i = \inf \{ f(x) : t_{i-1} \le x \le t_i \}, \]
\[ M_i = \sup \{ f(x) : t_{i-1} \le x \le t_i \}. \]

The lower sum of f for P, denoted by L(f, P), is defined as

\[ L(f, P) = \sum_{i=1}^{n} m_i (t_i - t_{i-1}). \]

The upper sum of f for P, denoted by U(f, P), is defined as

\[ U(f, P) = \sum_{i=1}^{n} M_i (t_i - t_{i-1}). \]
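The definitions of L(f, P) and U(f, P) can be turned into a short computation in one special situation. The sketch below assumes that f is nondecreasing on [a, b], so that the inf on each subinterval is the value at the left endpoint and the sup is the value at the right endpoint; for a general bounded function the inf and sup cannot be read off from finitely many values. The function names and the sample partition are illustrative only.

```python
from fractions import Fraction


def lower_sum(f, partition):
    """L(f, P) for a nondecreasing f: inf of f on [t_{i-1}, t_i] is f(t_{i-1})."""
    return sum(f(s) * (t - s) for s, t in zip(partition, partition[1:]))


def upper_sum(f, partition):
    """U(f, P) for a nondecreasing f: sup of f on [t_{i-1}, t_i] is f(t_i)."""
    return sum(f(t) * (t - s) for s, t in zip(partition, partition[1:]))


if __name__ == "__main__":
    square = lambda x: x * x                      # nondecreasing on [0, 1]
    P = [Fraction(0), Fraction(1, 2), Fraction(1)]  # t_0 < t_1 < t_2
    print(lower_sum(square, P), upper_sum(square, P))  # prints 1/8 5/8
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the sums can be compared without rounding error.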


The lower and upper sums correspond to the sums s and S in the previous
example; they are supposed to represent the total areas of rectangles lying
below and above the graph of f. Notice, however, that despite the geometric
motivation, these sums have been defined precisely without any appeal to a
concept of "area."

Two details of the definition deserve comment. The requirement that f be
bounded on [a, b] is essential in order that all the $m_i$ and $M_i$ be defined. Note,
also, that it was necessary to define the numbers $m_i$ and $M_i$ as inf's and sup's,
rather than as minima and maxima, since f was not assumed continuous.

One thing is clear about lower and upper sums: If P is any partition, then

\[ L(f, P) \le U(f, P), \]

because

\[ L(f, P) = \sum_{i=1}^{n} m_i (t_i - t_{i-1}), \]
\[ U(f, P) = \sum_{i=1}^{n} M_i (t_i - t_{i-1}), \]

and for each i we have

\[ m_i (t_i - t_{i-1}) \le M_i (t_i - t_{i-1}). \]

On the other hand, something less obvious ought to be true: If $P_1$ and $P_2$ are
any two partitions of [a, b], then it should be the case that

\[ L(f, P_1) \le U(f, P_2), \]

because $L(f, P_1)$ should be ≤ area R(f, a, b), and $U(f, P_2)$ should be ≥ area
R(f, a, b). This remark proves nothing (since the "area of R(f, a, b)" has not
even been defined yet), but it does indicate that if there is to be any hope of
defining the area of R(f, a, b), a proof that $L(f, P_1) \le U(f, P_2)$ should come
first. The proof which we are about to give depends upon a lemma which
concerns the behavior of lower and upper sums when more points are included
in a partition. In Figure 4 the partition P contains the points in black, and Q
contains both the points in black and the points in grey. The picture indicates
that the rectangles drawn for the partition Q are a better approximation to the
region R(f, a, b) than those for the original partition P. To be precise:

FIGURE 4

FIGURE 5

LEMMA

If Q contains P (i.e., if all points of P are also in Q), then

\[ L(f, P) \le L(f, Q), \]
\[ U(f, P) \ge U(f, Q). \]

PROOF


Consider first the special case (Figure 5) in which Q contains just one more
point than P:

\[ P = \{t_0, \ldots, t_n\}, \]
\[ Q = \{t_0, \ldots, t_{k-1}, u, t_k, \ldots, t_n\}, \]

where

\[ a = t_0 < t_1 < \cdots < t_{k-1} < u < t_k < \cdots < t_n = b. \]

Let

\[ m' = \inf \{ f(x) : t_{k-1} \le x \le u \}, \]
\[ m'' = \inf \{ f(x) : u \le x \le t_k \}. \]

Then

\[ L(f, P) = \sum_{i=1}^{n} m_i (t_i - t_{i-1}), \]
\[ L(f, Q) = \sum_{i=1}^{k-1} m_i (t_i - t_{i-1}) + m'(u - t_{k-1}) + m''(t_k - u) + \sum_{i=k+1}^{n} m_i (t_i - t_{i-1}). \]

To prove that L(f, P) ≤ L(f, Q) it therefore suffices to show that

\[ m_k (t_k - t_{k-1}) \le m'(u - t_{k-1}) + m''(t_k - u). \]

Now the set $\{f(x) : t_{k-1} \le x \le t_k\}$ contains all the numbers in
$\{f(x) : t_{k-1} \le x \le u\}$, and possibly some smaller ones, so the greatest lower bound of the
first set is less than or equal to the greatest lower bound of the second; thus

\[ m_k \le m'. \]

Similarly,

\[ m_k \le m''. \]

Therefore,

\[ m_k (t_k - t_{k-1}) = m_k (u - t_{k-1}) + m_k (t_k - u) \le m'(u - t_{k-1}) + m''(t_k - u). \]

This proves, in this special case, that L(f, P) ≤ L(f, Q). The proof that
U(f, P) ≥ U(f, Q) is similar, and is left to you as an easy, but valuable,
exercise.

The general case can now be deduced quite easily. The partition Q can be
obtained from P by adding one point at a time; in other words, there is a
sequence of partitions

\[ P = P_1, P_2, \ldots, P_\alpha = Q \]

such that $P_{j+1}$ contains just one more point than $P_j$. Then

\[ L(f, P) = L(f, P_1) \le L(f, P_2) \le \cdots \le L(f, P_\alpha) = L(f, Q), \]

and

\[ U(f, P) = U(f, P_1) \ge U(f, P_2) \ge \cdots \ge U(f, P_\alpha) = U(f, Q). \] ∎

The theorem we wish to prove is a simple consequence of this lemma.

THEOREM 1

Let $P_1$ and $P_2$ be partitions of [a, b], and let f be a function which is bounded on
[a, b]. Then

\[ L(f, P_1) \le U(f, P_2). \]

PROOF


There is a partition P which contains both $P_1$ and $P_2$ (let P consist of all points
in both $P_1$ and $P_2$). According to the lemma,

\[ L(f, P_1) \le L(f, P) \le U(f, P) \le U(f, P_2). \] ∎
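The chain of inequalities in this proof can be observed on a concrete example. The sketch below is an illustration under assumptions, not part of the argument: it takes f(x) = x² on [0, 1] (nondecreasing there, so inf and sup are attained at the endpoints of each subinterval), two hypothetical partitions `P1` and `P2`, and forms their common refinement.

```python
from fractions import Fraction


def sums(f, partition):
    """Return (L(f, P), U(f, P)) for a nondecreasing f on a sorted partition."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(s) * (t - s) for s, t in pairs)
    upper = sum(f(t) * (t - s) for s, t in pairs)
    return lower, upper


def common_refinement(p1, p2):
    """The partition consisting of all points in p1 or p2."""
    return sorted(set(p1) | set(p2))


if __name__ == "__main__":
    square = lambda x: x * x
    P1 = [Fraction(0), Fraction(1, 3), Fraction(1)]
    P2 = [Fraction(0), Fraction(1, 2), Fraction(1)]
    P = common_refinement(P1, P2)
    L1, _ = sums(square, P1)
    _, U2 = sums(square, P2)
    L, U = sums(square, P)
    # L(f, P1) <= L(f, P) <= U(f, P) <= U(f, P2)
    print(L1, L, U, U2, L1 <= L <= U <= U2)
```

Here the refinement raises the lower sum from 2/27 to 31/216 and the upper sum of the refinement, 125/216, stays below U(f, P2) = 5/8.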




FIGURE 6 


It follows from Theorem 1 that any upper sum U(f, P') is an upper bound
for the set of all lower sums L(f, P). Consequently, any upper sum U(f, P') is
greater than or equal to the least upper bound of all lower sums:

\[ \sup \{ L(f, P) : P \text{ a partition of } [a, b] \} \le U(f, P'), \]

for every P'. This, in turn, means that sup {L(f, P)} is a lower bound for the
set of all upper sums of f. Consequently,

\[ \sup \{ L(f, P) \} \le \inf \{ U(f, P) \}. \]

It is clear that both of these numbers are between the lower sum and upper
sum of f for all partitions:

\[ L(f, P') \le \sup \{ L(f, P) \} \le U(f, P'), \]
\[ L(f, P') \le \inf \{ U(f, P) \} \le U(f, P'), \]

for all partitions P'.

It may well happen that

\[ \sup \{ L(f, P) \} = \inf \{ U(f, P) \}; \]

in this case, this is the only number between the lower sum and upper sum of
f for all partitions, and this number is consequently an ideal candidate for the
area of R(f, a, b). On the other hand, if

\[ \sup \{ L(f, P) \} < \inf \{ U(f, P) \}, \]

then every number x between sup {L(f, P)} and inf {U(f, P)} will satisfy

\[ L(f, P') \le x \le U(f, P') \]

for all partitions P'.

It is not at all clear just when such an embarrassment of riches will occur.
The following two examples, although not as interesting as many which will
soon appear, show that both phenomena are possible.


Suppose first that f(x) = c for all x in [a, b] (Figure 6). If $P = \{t_0, \ldots, t_n\}$
is any partition of [a, b], then

\[ m_i = M_i = c, \]

so

\[ L(f, P) = \sum_{i=1}^{n} c\,(t_i - t_{i-1}) = c(b - a), \]
\[ U(f, P) = \sum_{i=1}^{n} c\,(t_i - t_{i-1}) = c(b - a). \]

In this case, all lower sums and upper sums are equal, and

\[ \sup \{ L(f, P) \} = \inf \{ U(f, P) \} = c(b - a). \]
Now consider (Figure 7) the function f defined by

\[ f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational.} \end{cases} \]

FIGURE 7

If $P = \{t_0, \ldots, t_n\}$ is any partition, then

$m_i = 0$, since there is an irrational number in $[t_{i-1}, t_i]$,

and

$M_i = 1$, since there is a rational number in $[t_{i-1}, t_i]$.

Therefore,

\[ L(f, P) = \sum_{i=1}^{n} 0 \cdot (t_i - t_{i-1}) = 0, \]
\[ U(f, P) = \sum_{i=1}^{n} 1 \cdot (t_i - t_{i-1}) = b - a. \]

Thus, in this case it is certainly not true that sup {L(f, P)} = inf {U(f, P)}.
The principle upon which the definition of area was to be based provides
insufficient information to determine a specific area for R(f, a, b); any
number between 0 and b − a seems equally good. On the other hand, the region
R(f, a, b) is so weird that we might with justice refuse to assign it any area at
all. In fact, we can maintain, more generally, that whenever

\[ \sup \{ L(f, P) \} \ne \inf \{ U(f, P) \}, \]

the region R(f, a, b) is too unreasonable to deserve having an area. As our
appeal to the word "unreasonable" suggests, we are about to cloak our
ignorance in terminology.

DEFINITION

A function f which is bounded on [a, b] is integrable on [a, b] if

\[ \sup \{ L(f, P) : P \text{ a partition of } [a, b] \} = \inf \{ U(f, P) : P \text{ a partition of } [a, b] \}. \]

In this case, this common number is called the integral of f on [a, b] and
is denoted by

\[ \int_a^b f. \]

(The symbol ∫ is called an integral sign and was originally an elongated s, for
"sum"; the numbers a and b are called the lower and upper limits of integration.)

The integral $\int_a^b f$ is also called the area of R(f, a, b) when f(x) ≥ 0 for all x
in [a, b].


If f is integrable, then according to this definition,

\[ L(f, P) \le \int_a^b f \le U(f, P) \quad \text{for all partitions } P \text{ of } [a, b]. \]

Moreover, $\int_a^b f$ is the unique number with this property.

This definition merely pinpoints, and does not solve, the problem discussed
before: we do not know which functions are integrable (nor do we know how
to find the integral of f on [a, b] when f is integrable). At present we know only
two examples:

(1) if f(x) = c, then f is integrable on [a, b] and $\int_a^b f = c \cdot (b - a)$.

(Notice that this integral assigns the expected area to a rectangle.)

(2) if $f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational,} \end{cases}$ then f is not integrable on [a, b].

Several more examples will be given before discussing these problems
further. Even for these examples, however, it helps to have the following simple
criterion for integrability stated explicitly.

THEOREM 2

If f is bounded on [a, b], then f is integrable on [a, b] if and only if for every
ε > 0 there is a partition P of [a, b] such that

\[ U(f, P) - L(f, P) < \varepsilon. \]

PROOF


Suppose first that for every ε > 0 there is a partition P with

\[ U(f, P) - L(f, P) < \varepsilon. \]

Since

\[ \inf \{ U(f, P') \} \le U(f, P), \]
\[ \sup \{ L(f, P') \} \ge L(f, P), \]

it follows that

\[ \inf \{ U(f, P') \} - \sup \{ L(f, P') \} < \varepsilon. \]

Since this is true for all ε > 0, it follows that

\[ \sup \{ L(f, P') \} = \inf \{ U(f, P') \}; \]

by definition, then, f is integrable. The proof of the converse assertion is
similar. If f is integrable, then

\[ \sup \{ L(f, P) \} = \inf \{ U(f, P) \}. \]

This means that for each ε > 0 there are partitions P', P'' with

\[ U(f, P'') - L(f, P') < \varepsilon. \]

Let P be a partition which contains both P' and P''. Then, according to the
lemma,

\[ U(f, P) \le U(f, P''), \]
\[ L(f, P) \ge L(f, P'); \]

consequently,

\[ U(f, P) - L(f, P) \le U(f, P'') - L(f, P') < \varepsilon. \] ∎


Although the mechanics of the proof take up a little space, it should be clear
that Theorem 2 amounts to nothing more than a restatement of the definition
of integrability. Nevertheless, it is a very convenient restatement because there
is no mention of sup's and inf's, which are often difficult to work with. The
next example illustrates this point, and also serves as a good introduction to
the type of reasoning which the complicated definition of the integral
necessitates, even in very simple situations.

Let f be defined on [0, 2] by

\[ f(x) = \begin{cases} 0, & x \ne 1 \\ 1, & x = 1. \end{cases} \]

Suppose $P = \{t_0, \ldots, t_n\}$ is a partition of [0, 2] with

\[ t_{j-1} < 1 < t_j \]

(see Figure 8). Then

\[ m_i = M_i = 0 \quad \text{if } i \ne j, \]

but

\[ m_j = 0 \quad\text{and}\quad M_j = 1. \]

FIGURE 8

Since

\[ L(f, P) = \sum_{i=1}^{j-1} m_i (t_i - t_{i-1}) + m_j (t_j - t_{j-1}) + \sum_{i=j+1}^{n} m_i (t_i - t_{i-1}), \]
\[ U(f, P) = \sum_{i=1}^{j-1} M_i (t_i - t_{i-1}) + M_j (t_j - t_{j-1}) + \sum_{i=j+1}^{n} M_i (t_i - t_{i-1}), \]

we have

\[ U(f, P) - L(f, P) = t_j - t_{j-1}. \]

This certainly shows that f is integrable: to obtain a partition P with

\[ U(f, P) - L(f, P) < \varepsilon \]

it is only necessary to choose a partition with

\[ t_{j-1} < 1 < t_j \quad\text{and}\quad t_j - t_{j-1} < \varepsilon. \]

Moreover, it is clear that

\[ L(f, P) \le 0 \le U(f, P) \quad \text{for all partitions } P. \]

Since f is integrable, there is only one number between all lower and upper
sums, namely, the integral of f, so

\[ \int_0^2 f = 0. \]
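The choice of partition in this example can be made concrete. In the sketch below, the helper names and the particular partition {0, 1 − ε/4, 1 + ε/4, 2} are assumptions chosen for illustration; the inf and sup on each subinterval are written down from the definition of f (every subinterval contains points other than 1, so the inf is always 0, and the sup is 1 exactly when 1 lies in the subinterval).

```python
from fractions import Fraction


def inf_sup(s, t):
    """Exact (inf, sup) on [s, t], s < t, of the function that is 1 at x = 1 and 0 elsewhere."""
    return 0, (1 if s <= 1 <= t else 0)


def lower_upper(partition):
    """Return (L(f, P), U(f, P)) for the function above."""
    lower = upper = Fraction(0)
    for s, t in zip(partition, partition[1:]):
        m, M = inf_sup(s, t)
        lower += m * (t - s)
        upper += M * (t - s)
    return lower, upper


def partition_for(eps):
    """A partition of [0, 2] with t_{j-1} < 1 < t_j and t_j - t_{j-1} = eps/2.

    Assumes 0 < eps < 4 so that the four points are distinct and lie in [0, 2].
    """
    return [Fraction(0), 1 - eps / 4, 1 + eps / 4, Fraction(2)]


if __name__ == "__main__":
    eps = Fraction(1, 100)
    L, U = lower_upper(partition_for(eps))
    print(L, U, U - L < eps)
```

With ε = 1/100 the lower sum is 0 and the upper sum is 1/200, so U(f, P) − L(f, P) < ε as Theorem 2 requires.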


Although the discontinuity of f was responsible for the difficulties in this
example, even worse problems arise for very simple continuous functions. For
example, let f(x) = x, and for simplicity consider an interval [0, b], where
b > 0. If $P = \{t_0, \ldots, t_n\}$ is a partition of [0, b], then (Figure 9)

\[ m_i = t_{i-1} \quad\text{and}\quad M_i = t_i; \]

FIGURE 9

and therefore

\[ L(f, P) = \sum_{i=1}^{n} t_{i-1}(t_i - t_{i-1})
           = t_0(t_1 - t_0) + t_1(t_2 - t_1) + \cdots + t_{n-1}(t_n - t_{n-1}), \]
\[ U(f, P) = \sum_{i=1}^{n} t_i(t_i - t_{i-1})
           = t_1(t_1 - t_0) + t_2(t_2 - t_1) + \cdots + t_n(t_n - t_{n-1}). \]


Neither of these formulas is particularly appealing, but both simplify con- 
siderably for partitions P, = {to, . . . , tn} into n equal subintervals. In this 
case, the length ἐς — ¢;_1 of each subinterval is b/n, so 


25 
fg = —> etc; 
n 


in general, 


Then 


L(f, Pa) = » ty_i(t; — t1) 


Remembering the formula 


1 + πον; οὖ + k = Kk + 1) 
2 
this can be written 
(n —1)(n) 36? 
L(f, Pn pee 
Ge ) 2 n? 
n—1 6? 


ee le 
— 


FIGURE 10 


FIGURE 11 


Integrals 223 


Similarly, 


U(f, Pn) 


I 


» κί — ἐς ἡ 


t=1 


n 


-yr. 
Lyn n 


+=1 


_nn+1) 8? 
725 n? 
ae ete 1.3}: 
oe τὴν oe. 


If n is very large, both $L(f, P_n)$ and $U(f, P_n)$ are close to $b^2/2$, and this remark
makes it easy to show that f is integrable. Notice first that

\[ U(f, P_n) - L(f, P_n) = \frac{2}{n} \cdot \frac{b^2}{2}. \]

This shows that there are partitions $P_n$ with $U(f, P_n) - L(f, P_n)$ as small as
desired. By Theorem 2 the function f is integrable. Moreover, $\int_0^b f$ may now
be found with only a little work. It is clear, first of all, that

\[ L(f, P_n) \le \frac{b^2}{2} \le U(f, P_n) \quad\text{for all } n. \]

This inequality shows only that $b^2/2$ lies between certain special upper and
lower sums, but we have just seen that $U(f, P_n) - L(f, P_n)$ can be made as
small as desired, so there is only one number with this property. Since the
integral certainly has this property, we can conclude that

\[ \int_0^b f = \frac{b^2}{2}. \]
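The computation above can be checked mechanically. The following is a minimal sketch, not part of the original text: the function name `lower_upper_identity` is hypothetical, and exact rational arithmetic is assumed so that the sums can be compared with the closed forms $\frac{n-1}{n}\cdot\frac{b^2}{2}$ and $\frac{n+1}{n}\cdot\frac{b^2}{2}$ without rounding.

```python
from fractions import Fraction


def lower_upper_identity(b, n):
    """Return (L(f, P_n), U(f, P_n)) for f(x) = x on [0, b].

    P_n is the partition into n equal subintervals, t_i = i*b/n.
    Because f is increasing, m_i = t_{i-1} and M_i = t_i.
    """
    b = Fraction(b)
    t = [i * b / n for i in range(n + 1)]
    lower = sum(t[i - 1] * (t[i] - t[i - 1]) for i in range(1, n + 1))
    upper = sum(t[i] * (t[i] - t[i - 1]) for i in range(1, n + 1))
    return lower, upper


# Both sums approach b^2/2 and their difference is b^2/n.
for n in (1, 10, 100, 1000):
    lower, upper = lower_upper_identity(3, n)
    print(n, float(lower), float(upper), float(upper - lower))
```

The printed gap shrinks like $b^2/n$, which is exactly the quantity that Theorem 2 needs to be small.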


Notice that this equation assigns area $b^2/2$ to a right triangle with base and
altitude b (Figure 10). Using more involved calculations, or appealing to
Theorem 4, it can be shown that

\[ \int_a^b f = \frac{b^2}{2} - \frac{a^2}{2}. \]

The function $f(x) = x^2$ presents even greater difficulties. In this case
(Figure 11), if $P = \{t_0, \dots, t_n\}$ is a partition of [0, b], then

\[ m_i = f(t_{i-1}) = (t_{i-1})^2 \quad\text{and}\quad M_i = f(t_i) = t_i^2. \]

Choosing, once again, a partition $P_n = \{t_0, \dots, t_n\}$ into n equal parts,
so that

\[ t_i = \frac{i \cdot b}{n}, \]




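the lower and upper sums become

\[ \begin{aligned}
L(f, P_n) &= \sum_{i=1}^{n} (t_{i-1})^2 (t_i - t_{i-1})
           = \sum_{i=1}^{n} (i-1)^2 \frac{b^2}{n^2} \cdot \frac{b}{n}
           = \frac{b^3}{n^3} \sum_{j=0}^{n-1} j^2, \\
U(f, P_n) &= \sum_{i=1}^{n} (t_i)^2 (t_i - t_{i-1})
           = \sum_{i=1}^{n} i^2 \frac{b^2}{n^2} \cdot \frac{b}{n}
           = \frac{b^3}{n^3} \sum_{j=1}^{n} j^2.
\end{aligned} \]

Recalling the formula

\[ 1^2 + \cdots + k^2 = \tfrac{1}{6} k(k+1)(2k+1) \]

from Problem 2-1, these sums can be written as

\[ \begin{aligned}
L(f, P_n) &= \frac{b^3}{n^3} \cdot \tfrac{1}{6} (n-1)(n)(2n-1), \\
U(f, P_n) &= \frac{b^3}{n^3} \cdot \tfrac{1}{6} n(n+1)(2n+1).
\end{aligned} \]

It is not too hard to show that

\[ L(f, P_n) \le \frac{b^3}{3} \le U(f, P_n), \]

and that $U(f, P_n) - L(f, P_n)$ can be made as small as desired, by choosing n
sufficiently large. The same sort of reasoning as before then shows that

\[ \int_0^b f = \frac{b^3}{3}. \]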


This calculation already represents a nontrivial result—the area of the region 
bounded by a parabola is not usually derived in elementary geometry. Never- 
theless, the result was known to Archimedes, who derived it in essentially the 
same way. The only superiority we can claim is that in the next chapter we 
will discover a much simpler way to arrive at this result. 
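As a check on the algebra for $f(x) = x^2$, here is a small sketch (the helper name `sums_for_square` is an assumption made for illustration, not notation from the text). It builds the lower and upper sums directly from the definition, using exact fractions, so they can be compared with the closed forms involving $\tfrac16(n-1)n(2n-1)$ and $\tfrac16 n(n+1)(2n+1)$.

```python
from fractions import Fraction


def sums_for_square(b, n):
    """Return (L(f, P_n), U(f, P_n)) for f(x) = x^2 on [0, b].

    P_n has n equal subintervals of width b/n; since f is increasing
    on [0, b], the inf and sup on [t_{i-1}, t_i] are taken at the ends.
    """
    b = Fraction(b)
    width = b / n
    lower = sum(((i - 1) * width) ** 2 * width for i in range(1, n + 1))
    upper = sum((i * width) ** 2 * width for i in range(1, n + 1))
    return lower, upper


b, n = 2, 4
lower, upper = sums_for_square(b, n)
closed_lower = Fraction(b) ** 3 / n ** 3 * Fraction((n - 1) * n * (2 * n - 1), 6)
closed_upper = Fraction(b) ** 3 / n ** 3 * Fraction(n * (n + 1) * (2 * n + 1), 6)
print(lower == closed_lower, upper == closed_upper)
```

Increasing n squeezes both sums toward $b^3/3$, which is the content of the argument above.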




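Some of our investigations can be summarized as follows:

\[ \begin{aligned}
\int_a^b f &= c \cdot (b - a) && \text{if } f(x) = c \text{ for all } x, \\
\int_a^b f &= \frac{b^2}{2} - \frac{a^2}{2} && \text{if } f(x) = x \text{ for all } x, \\
\int_a^b f &= \frac{b^3}{3} - \frac{a^3}{3} && \text{if } f(x) = x^2 \text{ for all } x.
\end{aligned} \]

This list already reveals that the notation $\int_a^b f$ suffers from the lack of a
convenient notation for naming functions defined by formulas. For this reason an
alternative notation,* analogous to the notation $\lim_{x \to a} f(x)$, is also useful:

\[ \int_a^b f(x)\,dx \quad\text{means precisely the same as}\quad \int_a^b f. \]

Thus

\[ \int_a^b c\,dx = c \cdot (b - a), \qquad
   \int_a^b x\,dx = \frac{b^2}{2} - \frac{a^2}{2}, \qquad
   \int_a^b x^2\,dx = \frac{b^3}{3} - \frac{a^3}{3}. \]

Notice that, as in the notation $\lim_{x \to a} f(x)$, the symbol x can be replaced by any
other letter (except f, a, or b, of course):

\[ \int_a^b f(x)\,dx = \int_a^b f(t)\,dt = \int_a^b f(\alpha)\,d\alpha
   = \int_a^b f(y)\,dy = \int_a^b f(c)\,dc. \]

The symbol dx has no meaning in isolation, any more than the symbol $x \to$
has any meaning, except in the context $\lim_{x \to a} f(x)$. In the equation

\[ \int_a^b x^2\,dx = \frac{b^3}{3} - \frac{a^3}{3}, \]

the entire symbol $x^2\,dx$ may be regarded as an abbreviation for:


the function f such that $f(x) = x^2$ for all x.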


* The notation $\int_a^b f(x)\,dx$ is actually the older, and was for many years the only, symbol for
the integral. Leibniz used this symbol because he considered the integral to be the sum
(denoted by $\int$) of infinitely many rectangles with height f(x) and "infinitely small" width dx.
Later writers used $x_0, \dots, x_n$ to denote the points of a partition, and abbreviated $x_i - x_{i-1}$
by $\Delta x_i$. The integral was defined as the limit as $\Delta x_i$ approaches 0 of the sums
$\sum_{i=1}^{n} f(x_i)\,\Delta x_i$ (analogous to lower and upper sums). The fact that the limit is
obtained by changing $\sum$ to $\int$, $f(x_i)$ to f(x), and $\Delta x_i$ to dx, delights many people.




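This notation for the integral is as flexible as the notation $\lim_{x \to a} f(x)$. Several
examples may aid in the interpretation of various types of formulas which
frequently appear; we have made use of Theorems 5 and 6.*

\[ \begin{aligned}
(1)\quad & \int_a^b (x + y)\,dx = \int_a^b x\,dx + \int_a^b y\,dx
          = \frac{b^2}{2} - \frac{a^2}{2} + y(b - a). \\
(2)\quad & \int_a^x (y + t)\,dy = \int_a^x y\,dy + \int_a^x t\,dy
          = \frac{x^2}{2} - \frac{a^2}{2} + t(x - a). \\
(3)\quad & \int_a^b \Bigl( \int_a^x (1 + t)\,dz \Bigr) dx = \int_a^b (1 + t)(x - a)\,dx \\
         & \qquad = (1 + t) \int_a^b (x - a)\,dx \\
         & \qquad = (1 + t) \Bigl[ \frac{b^2}{2} - \frac{a^2}{2} - a(b - a) \Bigr]. \\
(4)\quad & \int_a^b \Bigl( \int_c^d (x + y)\,dy \Bigr) dx
          = \int_a^b \Bigl[ \frac{d^2}{2} - \frac{c^2}{2} + x(d - c) \Bigr] dx \\
         & \qquad = \Bigl( \frac{d^2}{2} - \frac{c^2}{2} \Bigr)(b - a) + (d - c) \int_a^b x\,dx.
\end{aligned} \]

The computations of $\int_a^b x\,dx$ and $\int_a^b x^2\,dx$ may suggest that evaluating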


integrals is generally difficult or impossible. As a matter of fact, the integrals 
of most functions are impossible to determine exactly (although they may be com- 
puted to any degree of accuracy desired by calculating lower and upper sums). Neverthe- 
less, as we shall see in the next chapter, the integral of many functions can be 
computed very easily. 
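The remark that an integral can be computed to any desired accuracy from lower and upper sums can be made concrete. The sketch below is an illustration under stated assumptions, not a method from the text: the function name `bracket_integral_decreasing` is hypothetical, the integrand is assumed to be decreasing (so that the inf and sup on each subinterval sit at the right and left endpoints), and ordinary floating-point arithmetic is used.

```python
def bracket_integral_decreasing(f, a, b, tol):
    """Bracket the integral of a decreasing function f on [a, b].

    Doubles the number n of equal subintervals until the upper and
    lower sums differ by less than tol, then returns (lower, upper, n).
    For decreasing f, m_i = f(t_i) and M_i = f(t_{i-1}).
    """
    n = 1
    while True:
        width = (b - a) / n
        points = [a + i * width for i in range(n + 1)]
        lower = sum(f(points[i]) * width for i in range(1, n + 1))
        upper = sum(f(points[i - 1]) * width for i in range(1, n + 1))
        if upper - lower < tol:
            return lower, upper, n
        n *= 2


# Example: f(t) = 1/t on [1, 2], an integral the text does not evaluate in closed form.
lower, upper, n = bracket_integral_decreasing(lambda t: 1.0 / t, 1.0, 2.0, 1e-3)
print(n, lower, upper)
```

The integral is guaranteed to lie between the two returned numbers, because every lower sum is at most the integral and every upper sum is at least the integral.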

Even though most integrals cannot be computed exactly, it is important
at least to know when a function f is integrable on [a, b]. Although it is possible
to say precisely which functions are integrable, the criterion for integrability
is a little too difficult to be stated here, and we will have to settle for partial
results. The most useful result will be stated now, but proved at the end of the
next chapter.


THEOREM 3   If f is continuous on [a, b], then f is integrable on [a, b].


(Notice that it is unnecessary to assume that f is bounded on [a, b], because
continuity on [a, b] already implies boundedness.)

* Lest chaos overtake the reader when he reads other books, equation (1) requires an important
qualification. This equation interprets $\int_a^b y\,dx$ to mean the integral of the function f such
that each value f(x) is the number y. But classical notation often uses y for y(x), so $\int_a^b y\,dx$
might mean the integral of some arbitrary function y.


FIGURE 12 


THEOREM 4 


PROOF 




Although this theorem will provide all the information necessary for the use 
of integrals in this book, it is more satisfying to have a somewhat larger supply 
of integrable functions. Several problems treat this question in detail. It will 
help to know the following three theorems, which show that f is integrable on
[a, b], if it is integrable on [a, c] and [c, b]; that f + g is integrable if f and g are;
and that c · f is integrable if f is integrable and c is any number.

As a simple application of these theorems, recall that if f is 0 except at one
point, where its value is 1, then f is integrable. Multiplying this function by c, it
follows that the same is true if the value of f at the exceptional point is c.
Adding such a function to an integrable function, we see that the value of an 
integrable function may be changed arbitrarily at one point without destroy- 
ing integrability. By breaking up the interval into many subintervals, we see 
that the value can be changed at finitely many points. 

The proofs of these theorems usually use the alternative criterion for inte- 
grability in Theorem 2; as some of our previous demonstrations illustrate, the 
details of the argument often conspire to obscure the point of the proof. It is a 
good idea to attempt proofs of your own, consulting those given here as a 
last resort, or as a check. This will probably clarify the proofs, and will cer- 
tainly give good practice in the techniques used in some of the problems. 


Let a < c < b. If f is integrable on [a, b], then f is integrable on [a, c] and on
[c, b]. Conversely, if f is integrable on [a, c] and on [c, b], then f is integrable on
[a, b]. Finally, if f is integrable on [a, b], then

\[ \int_a^b f = \int_a^c f + \int_c^b f. \]


Suppose f is integrable on [a, b]. If $\varepsilon > 0$, there is a partition $P = \{t_0, \dots, t_n\}$
of [a, b] such that

\[ U(f, P) - L(f, P) < \varepsilon. \]

We might as well assume that $c = t_j$ for some j. (Otherwise, let Q be the
partition which contains $t_0, \dots, t_n$ and c; then Q contains P, so
$U(f, Q) - L(f, Q) \le U(f, P) - L(f, P) < \varepsilon$.)
Now $P' = \{t_0, \dots, t_j\}$ is a partition of [a, c] and $P'' = \{t_j, \dots, t_n\}$ is
a partition of [c, b] (Figure 12). Since

\[ \begin{aligned}
L(f, P) &= L(f, P') + L(f, P''), \\
U(f, P) &= U(f, P') + U(f, P''),
\end{aligned} \]

we have

\[ [U(f, P') - L(f, P')] + [U(f, P'') - L(f, P'')] = U(f, P) - L(f, P) < \varepsilon. \]

Since each of the terms in brackets is nonnegative, each is less than $\varepsilon$. This
shows that f is integrable on [a, c] and [c, b]. Note also that

\[ \begin{aligned}
L(f, P') &\le \int_a^c f \le U(f, P'), \\
L(f, P'') &\le \int_c^b f \le U(f, P''),
\end{aligned} \]



THEOREM 5 


PROOF 


so that

\[ L(f, P) \le \int_a^c f + \int_c^b f \le U(f, P). \]

Since this is true for any P, this proves that

\[ \int_a^c f + \int_c^b f = \int_a^b f. \]


Now suppose that f is integrable on [a, c] and on [c, b]. If $\varepsilon > 0$, there is a
partition P' of [a, c] and a partition P'' of [c, b] such that

\[ \begin{aligned}
U(f, P') - L(f, P') &< \varepsilon/2, \\
U(f, P'') - L(f, P'') &< \varepsilon/2.
\end{aligned} \]

If P is the partition of [a, b] containing all the points of P' and P'', then

\[ \begin{aligned}
L(f, P) &= L(f, P') + L(f, P''), \\
U(f, P) &= U(f, P') + U(f, P'');
\end{aligned} \]

consequently,

\[ U(f, P) - L(f, P) = [U(f, P') - L(f, P')] + [U(f, P'') - L(f, P'')] < \varepsilon. \quad ∎ \]

Theorem 4 is the basis for some minor notational conventions. The integral
$\int_a^b f$ was defined only for a < b. We now add the definitions

\[ \int_a^a f = 0 \quad\text{and}\quad \int_a^b f = -\int_b^a f \quad\text{if } a > b. \]

With these definitions, the equation $\int_a^c f + \int_c^b f = \int_a^b f$ holds for all a, c, b
even if a < c < b is not true (the proof of this assertion is a rather tedious
case-by-case check).
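The case-by-case check can at least be spot-checked by machine for one function. The sketch below is an illustration only: `integral_of_square` and `oriented` are hypothetical helper names, and the value $b^3/3 - a^3/3$ computed earlier for $f(x) = x^2$ is used as the integral for a < b. The two added conventions are then applied for a = b and a > b, and the additivity equation is tested for every ordering of three points.

```python
from fractions import Fraction
from itertools import product


def integral_of_square(a, b):
    """The integral of x^2 over [a, b], defined only for a < b."""
    if not a < b:
        raise ValueError("defined only for a < b")
    return Fraction(b) ** 3 / 3 - Fraction(a) ** 3 / 3


def oriented(integral, a, b):
    """Extend an integral defined for a < b using the two conventions."""
    if a == b:
        return Fraction(0)          # integral from a to a is 0
    if a > b:
        return -integral(b, a)      # reversing the limits changes the sign
    return integral(a, b)


def additivity_holds(integral, values):
    """Check int_a^c + int_c^b == int_a^b for every triple from values."""
    return all(
        oriented(integral, a, c) + oriented(integral, c, b)
        == oriented(integral, a, b)
        for a, c, b in product(values, repeat=3)
    )


print(additivity_holds(integral_of_square, [-2, 0, 1, 3]))
```

This is evidence for one function and a handful of points, not a proof; the proof is the tedious check mentioned above.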


If f and g are integrable on [a, b], then f + g is integrable on [a, b] and

\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g. \]


Let $P = \{t_0, \dots, t_n\}$ be any partition of [a, b]. Let

\[ \begin{aligned}
m_i   &= \inf \{ (f + g)(x) : t_{i-1} \le x \le t_i \}, \\
m_i'  &= \inf \{ f(x) : t_{i-1} \le x \le t_i \}, \\
m_i'' &= \inf \{ g(x) : t_{i-1} \le x \le t_i \},
\end{aligned} \]

and define $M_i$, $M_i'$, $M_i''$ similarly. It is not necessarily true that

\[ m_i = m_i' + m_i'', \]

but it is true (Problem 7) that

\[ m_i \ge m_i' + m_i''. \]

Similarly,

\[ M_i \le M_i' + M_i''. \]


THEOREM 6 


PROOF 




Therefore,

\[ L(f, P) + L(g, P) \le L(f + g, P) \]

and

\[ U(f + g, P) \le U(f, P) + U(g, P). \]

Thus,

\[ L(f, P) + L(g, P) \le L(f + g, P) \le U(f + g, P) \le U(f, P) + U(g, P). \]

Since f and g are integrable, there are partitions P', P'' with

\[ \begin{aligned}
U(f, P') - L(f, P') &< \varepsilon/2, \\
U(g, P'') - L(g, P'') &< \varepsilon/2.
\end{aligned} \]

If P contains both P' and P'', then

\[ U(f, P) + U(g, P) - [L(f, P) + L(g, P)] < \varepsilon, \]

and consequently

\[ U(f + g, P) - L(f + g, P) < \varepsilon. \]

This proves that f + g is integrable on [a, b]. Moreover,

\[ (1)\quad L(f, P) + L(g, P) \le L(f + g, P) \le \int_a^b (f + g) \le U(f + g, P) \le U(f, P) + U(g, P); \]

and also

\[ (2)\quad L(f, P) + L(g, P) \le \int_a^b f + \int_a^b g \le U(f, P) + U(g, P). \]

Since $U(f, P) - L(f, P)$ and $U(g, P) - L(g, P)$ can both be made as small as
desired, it follows that

\[ U(f, P) + U(g, P) - [L(f, P) + L(g, P)] \]

can also be made as small as desired; it therefore follows from (1) and (2) that

\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g. \quad ∎ \]
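The inequality $m_i \ge m_i' + m_i''$ used in this proof can be strict, and a small numerical experiment shows it. The sketch below is an assumption-laden illustration: the helper `approx_sums` is hypothetical, and it only approximates inf and sup by sampling each subinterval at evenly spaced points (which happens to be exact for the monotone functions f(x) = x and g(x) = -x chosen here, whose extremes occur at endpoints).

```python
def approx_sums(f, partition, samples=100):
    """Approximate (L(f, P), U(f, P)) by sampling each subinterval.

    The inf and sup on [t_{i-1}, t_i] are replaced by the min and max of f
    over samples + 1 evenly spaced points, endpoints included. This is
    only an approximation for general f.
    """
    lower = upper = 0.0
    for left, right in zip(partition, partition[1:]):
        values = [f(left + (right - left) * k / samples)
                  for k in range(samples + 1)]
        lower += min(values) * (right - left)
        upper += max(values) * (right - left)
    return lower, upper


def f(x):
    return x


def g(x):
    return -x


def f_plus_g(x):
    return f(x) + g(x)


P = [0.0, 0.5, 1.0]
lower_f, upper_f = approx_sums(f, P)
lower_g, upper_g = approx_sums(g, P)
lower_sum, upper_sum = approx_sums(f_plus_g, P)
print(lower_f + lower_g, lower_sum, upper_sum, upper_f + upper_g)
```

Here f + g is identically 0, so its lower and upper sums are both 0, while $L(f, P) + L(g, P)$ is negative and $U(f, P) + U(g, P)$ is positive: the chain of inequalities in the proof holds with room to spare.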


If f is integrable on [a, b], then for any number c, the function cf is integrable
on [a, b] and

\[ \int_a^b cf = c \cdot \int_a^b f. \]


The proof (which is much easier than that of Theorem 5) is left to you. It is a
good idea to treat separately the cases $c \ge 0$ and $c \le 0$. Why? ∎


(Theorem 6 is just a special case of the more general theorem that f · g is
integrable on [a, b], if f and g are, but this result is quite hard to prove
(see Problem 32).)

In this chapter we have acquired only one complicated definition, a few 
simple theorems with intricate proofs, and one theorem with no proof at all. 




FIGURE 13 


THEOREM 7 


PROOF 




THEOREM 8 


PROOF 


This is not because integrals constitute a more difficult topic than derivatives, 
but because powerful tools developed in previous chapters have been allowed 
to remain dormant. The most significant discovery of calculus is the fact that
the integral and the derivative are intimately related; once we learn the
connection, the integral will become as useful as the derivative, and as easy to use.
The connection between derivatives and integrals deserves a separate chapter, 
but the preparations which we will make in this chapter may serve as a hint. 
We first state a simple inequality concerning integrals, which plays a role in 
many important theorems. 


Suppose f is integrable on [a, b] and that

\[ m \le f(x) \le M \quad\text{for all } x \text{ in } [a, b]. \]

Then

\[ m(b - a) \le \int_a^b f \le M(b - a). \]


It is clear that

\[ m(b - a) \le L(f, P) \quad\text{and}\quad U(f, P) \le M(b - a) \]

for every partition P. Since $\int_a^b f = \sup \{L(f, P)\} = \inf \{U(f, P)\}$, the desired
inequality follows immediately. ∎


Suppose now that f is integrable on [a, b]. We can define a new function F
on [a, b] by

\[ F(x) = \int_a^x f = \int_a^x f(t)\,dt. \]


(This depends on Theorem 4.) We have seen that f may be integrable even if 
it is not continuous, and the Problems give examples of integrable functions 
which are quite pathological. The behavior of F is therefore a very pleasant 
surprise. 


If f is integrable on [a, b] and F is defined on [a, b] by

\[ F(x) = \int_a^x f, \]

then F is continuous on [a, b].


Suppose c is in [a, b]. Since f is integrable on [a, b] it is, by definition, bounded
on [a, b]; let M be a number such that

\[ |f(x)| \le M \quad\text{for all } x \text{ in } [a, b]. \]

If h > 0, then (Figure 13)

\[ F(c + h) - F(c) = \int_a^{c+h} f - \int_a^c f = \int_c^{c+h} f. \]

Since

\[ -M \le f(x) \le M \quad\text{for all } x, \]




FIGURE 14 


it follows from Theorem 7 that

\[ -M \cdot h \le \int_c^{c+h} f \le M \cdot h; \]

in other words,

\[ (1)\quad -M \cdot h \le F(c + h) - F(c) \le M \cdot h. \]




If h < 0, a similar inequality can be derived. Note that

\[ F(c + h) - F(c) = \int_c^{c+h} f = -\int_{c+h}^{c} f. \]

Applying Theorem 7 to the interval [c + h, c], of length -h, we obtain

\[ Mh \le \int_{c+h}^{c} f \le -Mh; \]

multiplying by -1, which reverses all the inequalities, we have

\[ (2)\quad Mh \le F(c + h) - F(c) \le -Mh. \]

Inequalities (1) and (2) can be combined:

\[ |F(c + h) - F(c)| \le M \cdot |h|. \]

Therefore, if $\varepsilon > 0$, we have

\[ |F(c + h) - F(c)| < \varepsilon, \]

provided that $|h| < \varepsilon/M$. This proves that

\[ \lim_{h \to 0} F(c + h) = F(c); \]

in other words F is continuous at c. ∎
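The bound $|F(c + h) - F(c)| \le M \cdot |h|$ can be observed on a discontinuous example. The following sketch makes several assumptions for illustration: f is taken to be the step function that is 0 on [0, 1) and 1 on [1, 2], its integral from 0 to x is written in closed form as max(0, x - 1) (which follows from Theorem 4 and the integral of a constant), and `max_ratio` is a hypothetical helper name.

```python
def f(x):
    """A bounded function on [0, 2] with a jump at x = 1; here M = 1."""
    return 0.0 if x < 1 else 1.0


def F(x):
    """F(x) = integral of f from 0 to x, for x in [0, 2]."""
    return max(0.0, x - 1.0)


def max_ratio(F, c, hs):
    """Largest value of |F(c + h) - F(c)| / |h| over the given h."""
    return max(abs(F(c + h) - F(c)) / abs(h) for h in hs)


hs = [s * 10.0 ** -k for k in range(1, 9) for s in (1, -1)]
for c in (0.5, 1.0, 1.5):
    print(c, max_ratio(F, c, hs))
```

Even at c = 1, where f jumps, the ratios never exceed M = 1 (up to rounding), so F is continuous there although f is not.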


Figure 14 compares f and $F(x) = \int_a^x f$ for various functions f; it appears


that F is always better behaved than f. In the next chapter we will see how 
true this is. 


PROBLEMS 


1. Prove that $\int_0^b x^3\,dx = b^4/4$, by considering partitions into n equal
subintervals, using the formula for $\sum_{i=1}^{n} i^3$ which was found in Problem 2-5.
This problem requires only a straightforward imitation of calculations
in the text, but you should write it up as a formal proof to make certain
that all the fine points of the argument are clear.

2. Prove, similarly, that $\int_0^b x^4\,dx = b^5/5$.

3. (a) Using Problem 2-6, show that $\sum_{k=1}^{n} k^p / n^{p+1}$ can be made as close
to 1/(p + 1) as desired, by choosing n large enough.
(b) Prove that $\int_0^b x^p\,dx = b^{p+1}/(p + 1)$.

4. Decide which of the following functions are integrable on [0, 2], and
calculate the integral when you can.


FIGURE 15 


5, 




ji) fay =) _, 
Gi) fo) =x toh 
Ga). Fre : + [x], χ rational 


x irrational. 

1, x of the form a + 6 V2 for rational a and ὁ 
0, x not of this form. 
ἘΠΕ O<x<1 
wi) 70) = 11} 

x 

0, χΞΘΟΡΧ)Ί. 

(vii) f is the function shown in Figure 15. 


ὁ) st) = | 


5. Find the areas of the regions bounded by

(i) the graphs of $f(x) = x^2$ and $g(x) = \frac{x^2}{2} + 2$.
(ii) the graphs of $f(x) = x^2$ and $g(x) = -x^2$ and the vertical lines
through (-1, 0) and (1, 0).
(iii) the graphs of $f(x) = x^2$ and $g(x) = 1 - x^2$.
(iv) the graphs of $f(x) = x^2$ and $g(x) = 1 - x^2$ and h(x) = 2.
(v) the graphs of $f(x) = x^2$ and $g(x) = x^2 - 2x + 4$ and the vertical
axis.
(vi) the graph of $f(x) = \sqrt{x}$, the horizontal axis, and the vertical line
through (2, 0). (Don't try to find $\int_0^2 \sqrt{x}\,dx$; you should see a way
of guessing the answer, using only integrals that you already know
how to evaluate. The questions that this example should suggest are
considered in Problem 16.)

6. Find

\[ \int_a^b \Bigl( \int_c^d f(x) g(y)\,dy \Bigr) dx \]

in terms of $\int_a^b f$ and $\int_c^d g$. (This problem is an exercise in notation, with
a vengeance; it is crucial that you recognize a constant when it appears.)

7. Prove, using the notation of Theorem 5, that

\[ m_i' + m_i'' = \inf \{ f(x_1) + g(x_2) : t_{i-1} \le x_1, x_2 \le t_i \} \le m_i. \]

8. (a) Which functions have the property that every lower sum equals
every upper sum?
(b) Which functions have the property that some upper sum equals
some (other) lower sum?
(c) Which continuous functions have the property that all lower sums
are equal?


*(d) Which integrable functions have the property that all lower sums
are equal? (Bear in mind that one such function is f(x) = 0 for x
irrational, f(x) = 1/q for x = p/q in lowest terms.) Hint: You will
need the notion of a dense set, introduced in Problem 8-6, as well as
the results of Problem 21.

9. If a < b < c < d and f is integrable on [a, d], prove that f is integrable
on [b, c]. (Don't work hard.)

10. (a) Prove that if f is integrable on [a, b] and $f(x) \ge 0$ for all x in [a, b],
then $\int_a^b f \ge 0$.
(b) Prove that if f and g are integrable on [a, b] and $f(x) \ge g(x)$ for all x
in [a, b], then $\int_a^b f \ge \int_a^b g$. (By now it should be unnecessary to
warn that if you work hard on part (b) you are wasting time.)

11. Prove that

\[ \int_a^b f(x)\,dx = \int_{a+c}^{b+c} f(x - c)\,dx. \]

(The geometric interpretation should make this very plausible.) Hint:
Every partition $P = \{t_0, \dots, t_n\}$ of [a, b] gives rise to a partition
$P' = \{t_0 + c, \dots, t_n + c\}$ of [a + c, b + c], and conversely.

*12. Prove that

\[ \int_1^a \frac{1}{t}\,dt + \int_1^b \frac{1}{t}\,dt = \int_1^{ab} \frac{1}{t}\,dt. \]

Hint: This can be written $\int_1^a 1/t\,dt = \int_b^{ab} 1/t\,dt$. Every partition
$P = \{t_0, \dots, t_n\}$ of [1, a] gives rise to a partition $P' = \{bt_0, \dots, bt_n\}$
of [b, ab], and conversely.

*13. Prove that

\[ \int_{ca}^{cb} f(t)\,dt = c \int_a^b f(ct)\,dt. \]

(Notice that Problem 12 is a special case.)

14. Suppose that f is bounded on [a, b] and that f is continuous at each point
in [a, b] with the exception of $x_0$ in (a, b). Prove that f is integrable on
[a, b]. Hint: Imitate one of the examples in the text.

15. Suppose that f is nondecreasing on [a, b]. Notice that f is automatically
bounded on [a, b], because $f(a) \le f(x) \le f(b)$ for x in [a, b].

(a) If $P = \{t_0, \dots, t_n\}$ is a partition of [a, b], what is L(f, P) and
U(f, P)?
(b) Suppose that $t_i - t_{i-1} = \delta$ for each i. Prove that
$U(f, P) - L(f, P) = \delta [f(b) - f(a)]$.
(c) Prove that f is integrable.
(d) Give an example of a nondecreasing function on [0, 1] which is
discontinuous at infinitely many points.


FIGURE 16


FIGURE 17


It might be of interest to compare this problem with the following 
extract from Newton’s Principia.* 


LEMMA II 


If in any figure AacE, terminated by the right lines Aa, AE, and the curve
acE, there be inscribed any number of parallelograms Ab, Bc, Cd, &c.,
comprehended under equal bases AB, BC, CD, &c., and the sides, Bb, Cc, Dd,
&c., parallel to one side Aa of the figure; and the parallelograms aKbl,
bLcm, cMdn, &c., are completed: then if the breadth of those parallelograms
be supposed to be diminished, and their number to be augmented in infinitum,
I say, that the ultimate ratios which the inscribed figure AKbLcMdD, the
circumscribed figure AalbmcndoE, and curvilinear figure AabcdE, will have
to one another, are ratios of equality.

For the difference of the inscribed and circumscribed figures is the
sum of the parallelograms Kl, Lm, Mn, Do, that is (from the equality
of all their bases), the rectangle under one of their bases Kb and the
sum of their altitudes Aa, that is, the rectangle ABla. But this
rectangle, because its breadth AB is supposed diminished in infinitum,
becomes less than any given space. And therefore (by Lem. 1) the
figures inscribed and circumscribed become ultimately equal one to
the other; and much more will the intermediate curvilinear figure be
ultimately equal to either. Q.E.D.


*16. Suppose that f is increasing. Figure 16 suggests that

\[ \int_a^b f^{-1} = b f^{-1}(b) - a f^{-1}(a) - \int_{f^{-1}(a)}^{f^{-1}(b)} f. \]

(a) If $P = \{t_0, \dots, t_n\}$ is a partition of [a, b], let
$P' = \{f^{-1}(t_0), \dots, f^{-1}(t_n)\}$. Prove that, as suggested in Figure 17,

\[ L(f^{-1}, P) + U(f, P') = b f^{-1}(b) - a f^{-1}(a). \]

(b) Now prove the formula stated above.
(c) Find $\int_a^b \sqrt{x}\,dx$ for $0 \le a < b$.


17. A function s defined on [a, b] is called a step function if there is a
partition $P = \{t_0, \dots, t_n\}$ of [a, b] such that s is constant on each $(t_{i-1}, t_i)$
(the values of s at $t_i$ may be arbitrary).

(a) Prove that if f is integrable on [a, b], then for any $\varepsilon > 0$ there is a
step function $s_1 \le f$ with $\int_a^b f - \int_a^b s_1 < \varepsilon$, and also a step function
$s_2 \ge f$ with $\int_a^b s_2 - \int_a^b f < \varepsilon$.

(b) Suppose that for all $\varepsilon > 0$ there are step functions $s_1 \le f$ and
$s_2 \ge f$ such that $\int_a^b s_2 - \int_a^b s_1 < \varepsilon$. Prove that f is integrable.


* Newton’s Principia, A Revision of Mott’s Translation, by Florian Cajori. University of Cali- 
fornia Press, Berkeley, California, 1946. 


(c) Find a function f which is not a step function, but which satisfies
$\int_a^b f = L(f, P)$ for some partition P of [a, b].

*18. Prove that if f is integrable on [a, b], then for any $\varepsilon > 0$ there is a
continuous function $g \le f$ with $\int_a^b f - \int_a^b g < \varepsilon$. Hint: First get a step
function with this property, and then a continuous one. A picture will
help immensely.

19. (a) Show that if $s_1$ and $s_2$ are step functions on [a, b], then $s_1 + s_2$ is also.
(b) Prove, without using Theorem 5, that $\int_a^b (s_1 + s_2) = \int_a^b s_1 + \int_a^b s_2$.
(c) Use part (b) (and Problem 17) to give an alternative proof of
Theorem 5.

20. Suppose that f is integrable on [a, b]. Prove that there is a number x in
[a, b] such that $\int_a^x f = \int_x^b f$. Show by example that it is not always
possible to choose x to be in (a, b).

21. The purpose of this problem is to show that if f is integrable on [a, b],
then f must be continuous at many points in [a, b].

(a) Let $P = \{t_0, \dots, t_n\}$ be a partition of [a, b] with
$U(f, P) - L(f, P) < b - a$. Prove that for some i we have $M_i - m_i < 1$.
(b) Prove that there are numbers $a_1$ and $b_1$ with $a < a_1 < b_1 < b$ and
$\sup \{f(x) : a_1 \le x \le b_1\} - \inf \{f(x) : a_1 \le x \le b_1\} < 1$. (You can
choose $[a_1, b_1] = [t_{i-1}, t_i]$ from part (a) unless i = 1 or n; and in these
two cases a very simple device solves the problem.)
(c) Prove that there are numbers $a_2$ and $b_2$ with $a_1 < a_2 < b_2 < b_1$ and
$\sup \{f(x) : a_2 \le x \le b_2\} - \inf \{f(x) : a_2 \le x \le b_2\} < \tfrac{1}{2}$.
(d) Continue in this way to find a sequence of intervals $I_n = [a_n, b_n]$
such that $\sup \{f(x) : x \text{ in } I_n\} - \inf \{f(x) : x \text{ in } I_n\} < 1/n$. Apply the
Nested Intervals Theorem (Problem 8-14) to find a point x at which
f is continuous.
(e) Prove that f is continuous at infinitely many points in [a, b].

22. Recall, from Problem 10, that $\int_a^b f \ge 0$ if $f(x) \ge 0$ for all x in [a, b].

(a) Give an example where $f(x) \ge 0$ for all x, and f(x) > 0 for some x in
[a, b], and $\int_a^b f = 0$.
(b) Suppose $f(x) \ge 0$ for all x in [a, b] and f is continuous at $x_0$ in [a, b]
and $f(x_0) > 0$. Prove that $\int_a^b f > 0$. Hint: It suffices to find one
lower sum L(f, P) which is positive.
(c) Suppose f is integrable on [a, b] and f(x) > 0 for all x in [a, b].
Prove that $\int_a^b f > 0$. Hint: You will need Problem 21; indeed that
was one reason for including Problem 21.


23. (a) Suppose that f is continuous on [a, b] and $\int_a^b fg = 0$ for all
continuous functions g on [a, b]. Prove that f = 0. (This is easy; there
is an obvious g to choose.)
(b) Suppose f is continuous on [a, b] and that $\int_a^b fg = 0$ for those
continuous functions g on [a, b] which satisfy the extra conditions
g(a) = g(b) = 0. Prove that f = 0. (This innocent looking fact is
an important lemma in the calculus of variations; see the Suggested
Reading for references.) Hint: Derive a contradiction from the
assumption $f(x_0) > 0$ or $f(x_0) < 0$; the g you pick will depend on
the behavior of f near $x_0$.

24. Let f(x) = x for x rational and f(x) = 0 for x irrational.

(a) Compute L(f, P) for all partitions P of [0, 1].
(b) Find $\inf \{U(f, P) : P \text{ a partition of } [0, 1]\}$.

25. Let f(x) = 0 for irrational x, and 1/q if x = p/q in lowest terms. Show
that f is integrable on [0, 1] and that $\int_0^1 f = 0$. (Every lower sum is
clearly 0; you must figure out how to make upper sums small.)

*26. Find two functions f and g which are integrable, but whose composition
$g \circ f$ is not. Hint: Problem 25 is relevant.

27. (a) Prove that if $m \le f(x) \le M$ for all x in [a, b], then

\[ \int_a^b f(x)\,dx = (b - a)\mu \]

for some number $\mu$ with $m \le \mu \le M$.
(b) Prove that if f is continuous on [a, b], then

\[ \int_a^b f(x)\,dx = (b - a) f(\xi) \]

for some $\xi$ in [a, b]; and show by an example that continuity is
essential. This fact is called the (First) Mean Value Theorem for
Integrals.

28. (a) Suppose that f is continuous on [a, b] and that g is integrable and
nonnegative on [a, b]. Prove that

\[ \int_a^b f(x) g(x)\,dx = f(\xi) \int_a^b g(x)\,dx \]

for some $\xi$ in [a, b]. This result is called the Second (or Generalized)
Mean Value Theorem for Integrals. Hint: The inequality
$m \le f(x) \le M$ implies that $m g(x) \le f(x) g(x) \le M g(x)$, since g is
nonnegative.
(b) Show that the hypothesis for g is essential. (Try g(x) = x on [-1, 1].)


The First and Second Mean Value Theorems for integrals will be used (in a 
not very essential way) in Chapter 19. The result of the next problem, which 
I have cribbed from several older calculus texts, is called the Third Mean 
Value Theorem for Integrals. To be quite frank, I haven’t the slightest idea 
what it is good for. 


*29. (a) Suppose that g is increasing on [a, b] and that f is integrable on
[a, b]. Prove that there is a number $\xi$ in [a, b] such that

\[ \int_a^b f(x) g(x)\,dx = g(a) \int_a^{\xi} f(x)\,dx + g(b) \int_{\xi}^{b} f(x)\,dx. \]

Hint: What is the value of the right side for $\xi = a$ and for $\xi = b$?
(b) Show that the hypothesis on g is essential. (Try g(a) = g(b) = 0.)

*30. Let f be a bounded function on [a, b] and let P be a partition of [a, b].
Let $M_i$ and $m_i$ have their usual meanings, and let $M_i'$ and $m_i'$ have the
corresponding meanings for the function |f|.

(a) Prove that

\[ \begin{aligned}
M_i' &= \max(|M_i|, |m_i|), \\
m_i' &= \min(|M_i|, |m_i|).
\end{aligned} \]

(b) Prove that $M_i' - m_i' \le M_i - m_i$. Hint: You will have to consider
two separate cases, because that's how max and min are defined.
(c) Prove that if f is integrable on [a, b], then so is |f|.
(d) Prove that if f and g are integrable on [a, b], then so are max(f, g)
and min(f, g).
(e) Prove that f is integrable on [a, b] if and only if its "positive part"
max(f, 0) and its "negative part" min(f, 0) are integrable on [a, b].

31. Prove that if f is integrable on [a, b], then

\[ \Bigl| \int_a^b f(t)\,dt \Bigr| \le \int_a^b |f(t)|\,dt. \]

Hint: This follows easily from a certain string of inequalities; Problem
1-14 is relevant.

32. Suppose f and g are integrable on [a, b] and $f(x) \ge 0$ for all x in [a, b].
Let P be a partition of [a, b]. Let $M_i'$ and $m_i'$ denote the appropriate
sup's and inf's for f, define $M_i''$ and $m_i''$ similarly for g, and define $M_i$
and $m_i$ similarly for fg.

(a) Prove that $M_i \le M_i' M_i''$ and $m_i \ge m_i' m_i''$.
(b) Show that

\[ U(fg, P) - L(fg, P) \le \sum_{i=1}^{n} [M_i' M_i'' - m_i' m_i''](t_i - t_{i-1}). \]

(c) Using the fact that f and g are bounded, so that $|f(x)|, |g(x)| \le M$
for x in [a, b], show that

\[ U(fg, P) - L(fg, P) \le M \Bigl\{ \sum_{i=1}^{n} [M_i' - m_i'](t_i - t_{i-1})
   + \sum_{i=1}^{n} [M_i'' - m_i''](t_i - t_{i-1}) \Bigr\}. \]

(d) Prove that fg is integrable.
(e) Now eliminate the restriction that $f(x) \ge 0$ for x in [a, b].


33. Suppose that f and g are integrable on [a, b]. The Cauchy-Schwarz
inequality states that

\[ \Bigl( \int_a^b fg \Bigr)^2 \le \Bigl( \int_a^b f^2 \Bigr) \Bigl( \int_a^b g^2 \Bigr). \]

(a) Prove the Cauchy-Schwarz inequality by imitating one of the
proofs of the Schwarz inequality in Problem 1-18.
(b) If equality holds, is it necessarily true that $f = \lambda g$ for some $\lambda$? What
if f and g are continuous?
(c) Show that the Schwarz inequality is a special case of the
Cauchy-Schwarz inequality.
(d) Prove that $\bigl( \int_0^1 f \bigr)^2 \le \int_0^1 f^2$. Is this result true if 0 and 1 are
replaced by a and b?

34. Suppose that f is continuous and $\lim_{x \to \infty} f(x) = a$. Prove that

\[ \lim_{x \to \infty} \frac{1}{x} \int_0^x f(t)\,dt = a. \]

Hint: The condition $\lim_{x \to \infty} f(x) = a$ implies that f(t) is close to a for
$t \ge$ some N. This means that $\int_N^{N+M} f(t)\,dt$ is close to Ma. If M is large
in comparison to N, then Ma/(N + M) is close to a.


CHAPTER 


THEOREM 1 (THE FIRST 
FUNDAMENTAL THEOREM 
OF CALCULUS) 


PROOF 


THE FUNDAMENTAL THEOREM 
OF CALCULUS 7 


From the hints given in the previous chapter you may have already guessed
the first theorem of this chapter. We know that if f is integrable, then
$F(x) = \int_a^x f$ is continuous; it is only fitting that we ask what happens when the
original function f is continuous. It turns out that F is differentiable (and its
derivative is especially simple).


Let f be integrable on [a, b], and define F on [a, b] by

\[ F(x) = \int_a^x f. \]

If f is continuous at c in [a, b], then F is differentiable at c, and

\[ F'(c) = f(c). \]

(If c = a or b, then F'(c) is understood to mean the right- or left-hand
derivative of F.)


We will assume that c is in (a, b); the easy modifications for c = a or b may be
supplied by the reader. By definition,

\[ F'(c) = \lim_{h \to 0} \frac{F(c + h) - F(c)}{h}. \]

Suppose first that h > 0. Then

\[ F(c + h) - F(c) = \int_c^{c+h} f. \]

Define $m_h$ and $M_h$ as follows (Figure 1):

\[ \begin{aligned}
m_h &= \inf \{ f(x) : c \le x \le c + h \}, \\
M_h &= \sup \{ f(x) : c \le x \le c + h \}.
\end{aligned} \]

It follows from Theorem 13-7 that

\[ m_h \cdot h \le \int_c^{c+h} f \le M_h \cdot h. \]

Therefore

\[ m_h \le \frac{F(c + h) - F(c)}{h} \le M_h. \]

If h < 0, only a few details of the argument have to be changed. Let

\[ \begin{aligned}
m_h &= \inf \{ f(x) : c + h \le x \le c \}, \\
M_h &= \sup \{ f(x) : c + h \le x \le c \}.
\end{aligned} \]



FIGURE 1 


Then

\[ m_h \cdot (-h) \le \int_{c+h}^{c} f \le M_h \cdot (-h). \]

Since

\[ F(c + h) - F(c) = \int_c^{c+h} f = -\int_{c+h}^{c} f, \]

this yields

\[ m_h \cdot h \ge F(c + h) - F(c) \ge M_h \cdot h. \]

Since h < 0, dividing by h reverses the inequality again, yielding the same
result as before:

\[ m_h \le \frac{F(c + h) - F(c)}{h} \le M_h. \]

This inequality is true for any integrable function, continuous or not. Since
f is continuous at c, however,

\[ \lim_{h \to 0} m_h = \lim_{h \to 0} M_h = f(c), \]

and this proves that

\[ F'(c) = \lim_{h \to 0} \frac{F(c + h) - F(c)}{h} = f(c). \quad ∎ \]
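The squeeze $m_h \le [F(c + h) - F(c)]/h \le M_h$ at the heart of this proof can be watched in exact arithmetic. The sketch below is an illustration under assumptions: it takes $f(x) = x^2$ on an interval of nonnegative numbers (where f is increasing, so $m_h$ and $M_h$ are the values of f at the ends of the interval between c and c + h), uses $F(x) = x^3/3$ from the previous chapter, and the function name `squeeze` is hypothetical.

```python
from fractions import Fraction


def F(x):
    """F(x) = integral of t^2 from 0 to x, as computed in the previous chapter."""
    return Fraction(x) ** 3 / 3


def squeeze(c, h):
    """Return (m_h, difference quotient, M_h) for f(x) = x^2 at c.

    Assumes the interval between c and c + h lies in [0, infinity),
    where f is increasing, so the inf and sup are attained at its ends.
    """
    c, h = Fraction(c), Fraction(h)
    left, right = min(c, c + h), max(c, c + h)
    if left < 0:
        raise ValueError("this sketch assumes a nonnegative interval")
    m_h, M_h = left ** 2, right ** 2
    quotient = (F(c + h) - F(c)) / h
    return m_h, quotient, M_h


for k in range(1, 5):
    for sign in (1, -1):
        m_h, q, M_h = squeeze(1, Fraction(sign, 10 ** k))
        print(float(m_h), float(q), float(M_h))
```

As h shrinks, both bounds close in on f(1) = 1 from either side, for positive and negative h alike, and the difference quotient is trapped between them.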


Although Theorem 1 deals only with the function obtained by varying the
upper limit of integration, a simple trick shows what happens when the lower
limit is varied. If G is defined by

\[ G(x) = \int_x^b f, \]

then

\[ G(x) = \int_a^b f - \int_a^x f. \]




COROLLARY 


PROOF 


Consequently, if f is continuous at c, then

\[ G'(c) = -f(c). \]

The minus sign appearing here is very fortunate, and allows us to extend
Theorem 1 to the situation where the function

\[ F(x) = \int_a^x f \]

is defined even for x < a. In this case we can write

\[ F(x) = -\int_x^a f, \]

so if c < a we have

\[ F'(c) = -(-f(c)) = f(c), \]

exactly as before.

Notice that in either case, differentiability of F at c is ensured by continuity
of f at c alone. Nevertheless, Theorem 1 is most interesting when f is
continuous at all points in [a, b]. In this case F is differentiable at all points in
[a, b] and

\[ F' = f. \]

In general, it is extremely difficult to decide whether a given function f is the
derivative of some other function; for this reason Theorem 11-7 and Darboux's
Theorem (Problem 11-39) are particularly interesting, since they reveal
certain properties which f must have. If f is continuous, however, there is no
problem at all: according to Theorem 1, f is the derivative of some function,
namely, the function

\[ F(x) = \int_a^x f. \]


Theorem 1 has a simple corollary which frequently reduces computations 
of integrals to a triviality. 


If f is continuous on [a, b] and f = g' for some function g, then

\[ \int_a^b f = g(b) - g(a). \]


Let

\[ F(x) = \int_a^x f. \]

Then F' = f = g' on [a, b]. Consequently, there is a number c such that

\[ F = g + c. \]

The number c can be evaluated easily: note that

\[ 0 = F(a) = g(a) + c, \]

so c = -g(a); thus

\[ F(x) = g(x) - g(a). \]




This is true, in particular, for x = b. Thus

\[ \int_a^b f = F(b) = g(b) - g(a). \quad ∎ \]


The proof of this corollary tends, at first sight, to make the corollary seem
useless: after all, what good is it to know that

\[ \int_a^b f = g(b) - g(a) \]

if g is, for example, $g(x) = \int_a^x f$? The point, of course, is that one might happen
to know a quite different function g with this property. For example, if

\[ g(x) = \frac{x^3}{3} \quad\text{and}\quad f(x) = x^2, \]

then g'(x) = f(x) so we obtain, without ever computing lower and upper
sums:

\[ \int_a^b x^2\,dx = \frac{b^3}{3} - \frac{a^3}{3}. \]


One can treat other powers similarly; if n is a natural number and
$g(x) = x^{n+1}/(n + 1)$, then $g'(x) = x^n$, so

\[ \int_a^b x^n\,dx = \frac{b^{n+1}}{n + 1} - \frac{a^{n+1}}{n + 1}. \]

For any natural number n, the function $f(x) = x^{-n}$ is not bounded on any
interval containing 0, but if a and b are both positive or both negative, then

\[ \int_a^b x^{-n}\,dx = \frac{b^{-n+1}}{-n + 1} - \frac{a^{-n+1}}{-n + 1}. \]

Naturally this formula is only true for $n \ne -1$. We do not know a simple expression
for

\[ \int_a^b \frac{1}{x}\,dx. \]


The problem of computing this integral is discussed later, but it provides a
good opportunity to warn against a serious error. The conclusion of Corollary 1
is often confused with the definition of integrals; many students think that
\( \int_a^b f \) is defined as: “g(b) - g(a), where g is a function whose
derivative is f.” This “definition” is not only wrong, it is useless. One reason
is that a function f may be integrable without being the derivative of another
function. For example, if f(x) = 0 for x ≠ 1 and f(1) = 1, then f is integrable,
but f cannot be a derivative (why not?). There is also another reason that is
much more important: If f is continuous, then we know that f = g’ for some
function g; but we
know this only because of Theorem 1. The function f(x) = 1/x provides an
excellent illustration: if x > 0, then f(x) = g’(x), where

\[ g(x) = \int_1^x \frac{1}{t}\,dt, \]

and we know of no simpler function g with this property.

The corollary to Theorem 1 is so useful that it is frequently called the Second
Fundamental Theorem of Calculus. In this book, that name is reserved for
a somewhat stronger result (which in practice, however, is not much more
useful). As we have just mentioned, a function f might be of the form g’ even if
f is not continuous. If f is integrable, then it is still true that

\[ \int_a^b f = g(b) - g(a). \]

The proof, however, must be entirely different: we cannot use Theorem 1, so
we must return to the definition of integrals.

THEOREM 2 (THE SECOND FUNDAMENTAL THEOREM OF CALCULUS)  If f is integrable
on [a, b] and f = g’ for some function g, then

\[ \int_a^b f = g(b) - g(a). \]

PROOF  Let P = {t_0, . . . , t_n} be any partition of [a, b]. By the Mean Value
Theorem there is a point x_i in [t_{i-1}, t_i] such that

\[ g(t_i) - g(t_{i-1}) = g'(x_i)(t_i - t_{i-1}) = f(x_i)(t_i - t_{i-1}). \]

If

\[ m_i = \inf\{f(x) : t_{i-1} \le x \le t_i\}, \qquad M_i = \sup\{f(x) : t_{i-1} \le x \le t_i\}, \]

then clearly

\[ m_i(t_i - t_{i-1}) \le f(x_i)(t_i - t_{i-1}) \le M_i(t_i - t_{i-1}); \]

that is,

\[ m_i(t_i - t_{i-1}) \le g(t_i) - g(t_{i-1}) \le M_i(t_i - t_{i-1}). \]

Adding these equations for i = 1, . . . , n we obtain

\[ \sum_{i=1}^n m_i(t_i - t_{i-1}) \le g(b) - g(a) \le \sum_{i=1}^n M_i(t_i - t_{i-1}) \]

so that

\[ L(f, P) \le g(b) - g(a) \le U(f, P) \]

for every partition P. But this means that

\[ g(b) - g(a) = \int_a^b f. \] ∎
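The inequality L(f, P) ≤ g(b) - g(a) ≤ U(f, P) at the heart of this proof can be observed on concrete partitions. The following sketch assumes the particular pair g(x) = x³/3, f(x) = x² on [0, 2], where f is increasing so that the inf and sup on each subinterval sit at its endpoints; the function names are hypothetical.

```python
import random


def f(x):
    return x ** 2


def g(x):
    return x ** 3 / 3


def random_partition(a, b, inner_points, seed):
    """A partition a = t_0 < ... < t_n = b with randomly placed interior points."""
    rng = random.Random(seed)
    inner = sorted(rng.uniform(a, b) for _ in range(inner_points))
    return [a] + inner + [b]


def lower_upper_sums(func, partition):
    """L(func, P) and U(func, P), valid when func is increasing on the interval."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum(func(s) * (t - s) for s, t in pairs)  # inf is at the left endpoint
    upper = sum(func(t) * (t - s) for s, t in pairs)  # sup is at the right endpoint
    return lower, upper


if __name__ == "__main__":
    for seed in range(3):
        lower, upper = lower_upper_sums(f, random_partition(0.0, 2.0, 50, seed))
        print(lower, g(2.0) - g(0.0), upper)
```

Whatever partition is drawn, the middle number should lie between the other two, which is exactly what the theorem's proof establishes in general.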


We have already used the corollary to Theorem 1 (or, equivalently,
Theorem 2) to find the integrals of a few elementary functions:

\[ \int_a^b x^n\,dx = \frac{b^{n+1}}{n+1} - \frac{a^{n+1}}{n+1}, \qquad n \ne -1. \]

(a and b both positive or both negative if n < 0.)


FIGURE 2 


FIGURE 3 




As we pointed out in Chapter 13, this integral does not always represent the
area bounded by the graph of the function, the horizontal axis, and the vertical
lines through (a, 0) and (b, 0). For example, if a < 0 < b, then

\[ \int_a^b x^3\,dx \]

does not represent the area of the region shown in Figure 2, which is given
instead by

\[ -\left(\int_a^0 x^3\,dx\right) + \int_0^b x^3\,dx = \frac{a^4}{4} + \frac{b^4}{4}. \]
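The difference between the integral and the area is easy to see with numbers. This sketch assumes the sample endpoints a = -1 and b = 2 and uses a midpoint rule on |x³| as an independent estimate of the area; all names are illustrative.

```python
def signed_integral(a, b):
    """Integral of x^3 from a to b, computed from the antiderivative x^4/4."""
    return b ** 4 / 4 - a ** 4 / 4


def area(a, b):
    """Area between the graph of x^3 and the horizontal axis, assuming a < 0 < b."""
    return -signed_integral(a, 0.0) + signed_integral(0.0, b)


def midpoint_area(a, b, steps=30_000):
    """Midpoint-rule estimate of the integral of |x^3| over [a, b]."""
    width = (b - a) / steps
    return sum(abs((a + (i + 0.5) * width) ** 3) for i in range(steps)) * width


if __name__ == "__main__":
    print(signed_integral(-1.0, 2.0), area(-1.0, 2.0), midpoint_area(-1.0, 2.0))
```

For these endpoints the integral is b⁴/4 - a⁴/4 while the area is a⁴/4 + b⁴/4, so the two differ by twice the area of the part lying below the axis.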


Similar care must be exercised in finding the areas of regions which are
bounded by the graphs of more than one function, a problem which may
frequently involve considerable ingenuity in any case. Suppose, to take a
simple example first, that we wish to find the area of the region, shown in
Figure 3, between the graphs of the functions

\[ f(x) = x^2 \quad\text{and}\quad g(x) = x^3 \]

on the interval [0, 1]. If 0 ≤ x ≤ 1, then 0 ≤ x³ ≤ x², so that the graph of g
lies below that of f. The area of the region of interest to us is therefore

area R(f, 0, 1) - area R(g, 0, 1),

which is

\[ \int_0^1 x^2\,dx - \int_0^1 x^3\,dx = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}. \]

This area could have been expressed as

\[ \int_a^b (f - g). \]


If g(x) ≤ f(x) for all x in [a, b], then this integral always gives the area
bounded by f and g, even if f and g are sometimes negative. The easiest way to
see this is shown in Figure 4. If c is a number such that f + c and g + c are
nonnegative on [a, b], then the region R_1, bounded by f and g, has the same
area as the region R_2, bounded by f + c and g + c. Consequently,

\[ \text{area } R_1 = \text{area } R_2 = \int_a^b (f + c) - \int_a^b (g + c)
   = \int_a^b [(f + c) - (g + c)] = \int_a^b (f - g). \]


FIGURE 4

This observation is useful in the following problem: Find the area of the
region bounded by the graphs of

\[ f(x) = x^3 - x \quad\text{and}\quad g(x) = x^2. \]

The first necessity is to determine this region more precisely. The graphs of
f and g intersect when

\[ x^3 - x = x^2, \]
\[ \text{or}\quad x^3 - x^2 - x = 0, \]
\[ \text{or}\quad x(x^2 - x - 1) = 0, \]
\[ \text{or}\quad x = 0,\ \frac{1 + \sqrt{5}}{2},\ \frac{1 - \sqrt{5}}{2}. \]

On the interval ([1 - √5]/2, 0) we have x³ - x ≥ x² and on the interval
(0, [1 + √5]/2) we have x² ≥ x³ - x. These assertions are apparent from
the graphs (Figure 5), but they can also be checked easily, as follows. Since
f(x) = g(x) only if x = 0, [1 + √5]/2, or [1 - √5]/2, the function f - g
does not change sign on the intervals ([1 - √5]/2, 0) and (0, [1 + √5]/2);
it is therefore only necessary to observe, for example, that

\[ \left(-\tfrac12\right)^3 - \left(-\tfrac12\right) - \left(-\tfrac12\right)^2 = \tfrac18 > 0, \]
\[ 1^3 - 1 - 1^2 = -1 < 0, \]

to conclude that

\[ f - g \ge 0 \ \text{on}\ ([1 - \sqrt5]/2,\ 0), \]
\[ f - g \le 0 \ \text{on}\ (0,\ [1 + \sqrt5]/2). \]

The area of the region in question is thus

\[ \int_{\frac{1 - \sqrt5}{2}}^{0} (x^3 - x - x^2)\,dx + \int_0^{\frac{1 + \sqrt5}{2}} [x^2 - (x^3 - x)]\,dx. \]
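The two integrals above can be evaluated with an antiderivative of f - g and cross-checked numerically. This is a minimal sketch; the helper names and the use of a midpoint rule on |f - g| as the independent check are assumptions of the example.

```python
import math

LEFT = (1 - math.sqrt(5)) / 2
RIGHT = (1 + math.sqrt(5)) / 2


def H(x):
    """An antiderivative of f(x) - g(x) = x^3 - x - x^2."""
    return x ** 4 / 4 - x ** 2 / 2 - x ** 3 / 3


def enclosed_area():
    """Integral of f - g on (LEFT, 0) plus integral of g - f on (0, RIGHT)."""
    return (H(0.0) - H(LEFT)) + (H(0.0) - H(RIGHT))


def enclosed_area_numeric(steps=20_000):
    """Midpoint-rule estimate of the integral of |f - g| from LEFT to RIGHT."""
    width = (RIGHT - LEFT) / steps
    total = 0.0
    for i in range(steps):
        x = LEFT + (i + 0.5) * width
        total += abs(x ** 3 - x - x ** 2)
    return total * width


if __name__ == "__main__":
    print(enclosed_area(), enclosed_area_numeric())
```

Working the antiderivative out by hand, with φ² = φ + 1 for both intersection points, gives 13/12 for this area, and both functions above should print a value close to that.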


As this example reveals, one of the major problems involved in finding the
area of a region may be the exact determination of the region. There are,
however, more substantial problems of a logical nature: we have thus far
defined the areas of some very special regions only, which do not even include
some of the regions whose areas have just been computed! We have simply
assumed that area made sense for these regions, and that certain reasonable
properties of “area” do hold. These remarks are not meant to suggest that you
should regard exercising ingenuity to compute areas as beneath you, but are
meant to indicate that a better approach to the definition of area is available,
although its proper place is somewhere in advanced calculus. The desire to
define area was the motivation, both in this book and historically, for the
definition of the integral, but the integral does not really provide the best
method of defining areas, although it is frequently the proper tool for computing
them.

FIGURE 5

It may be discouraging to learn that integrals are not suitable for the very
purpose for which they were invented, but we will soon see how essential they
are for other purposes. The most important use of integrals has already been
emphasized: if f is continuous, the integral provides a function y such that

\[ y'(x) = f(x). \]

This equation is the simplest example of a “differential equation” (an equation
for a function y which involves derivatives of y). The Fundamental Theorem 
of Calculus says that this differential equation has a solution, if f is continuous. 
In succeeding chapters, and in various problems, we will solve more compli- 
cated equations, but the solution almost always depends somehow on the 
integral; in order to solve a differential equation it is necessary to construct a 
new function, and the integral is one of the best ways of doing this. 
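As an illustration of "constructing a new function" with the integral, the sketch below builds y(x) as a numerical integral of a continuous f and checks that its difference quotient is close to f(x). The choice f(t) = 1/(1 + sin² t), the base point 0, and the helper names are assumptions of this example.

```python
import math


def f(t):
    return 1.0 / (1.0 + math.sin(t) ** 2)


def y(x, a=0.0, steps=20_000):
    """Midpoint-rule approximation of the integral of f from a to x."""
    width = (x - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def numeric_derivative(func, x, h=1e-4):
    """Central difference quotient of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    x = 1.3
    print(numeric_derivative(y, x), f(x))
```

The two printed numbers should nearly coincide, which is the numerical face of the statement that y solves y’(x) = f(x).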

Since the differentiable functions provided by the Fundamental Theorem 
of Calculus will play such a prominent role in later work, it is very important 
to realize that these functions may be combined, like less esoteric functions, 
to yield still more functions, whose derivatives can be found by the Chain Rule. 

Suppose, for example, that

\[ f(x) = \int_a^{x^3} \frac{1}{1 + \sin^2 t}\,dt. \]

Although the notation tends to disguise the fact somewhat, f is the composition
of the functions

\[ C(x) = x^3 \quad\text{and}\quad F(x) = \int_a^x \frac{1}{1 + \sin^2 t}\,dt. \]

In fact, f(x) = F(C(x)); in other words, f = F ∘ C. Therefore, by the Chain
Rule,

\[ f'(x) = F'(C(x)) \cdot C'(x) = F'(x^3) \cdot 3x^2 = \frac{1}{1 + \sin^2 x^3} \cdot 3x^2. \]


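If f is defined, instead, as

\[ f(x) = \int_{x^3}^{a} \frac{1}{1 + \sin^2 t}\,dt, \]

then

\[ f'(x) = -\frac{1}{1 + \sin^2 x^3} \cdot 3x^2. \]

If f is defined as the reverse composition,

\[ f(x) = \left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)^3, \]

then

\[ f'(x) = C'(F(x)) \cdot F'(x)
        = 3\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)^2 \cdot \frac{1}{1 + \sin^2 x}. \]

Similarly, if

\[ f(x) = \int_a^{\sin x} \frac{1}{1 + \sin^2 t}\,dt, \qquad
   g(x) = \int_{\sin x}^{a} \frac{1}{1 + \sin^2 t}\,dt, \qquad
   h(x) = \sin\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right), \]

then

\[ f'(x) = \frac{1}{1 + \sin^2(\sin x)} \cdot \cos x, \]
\[ g'(x) = \frac{-1}{1 + \sin^2(\sin x)} \cdot \cos x, \]
\[ h'(x) = \cos\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right) \cdot \frac{1}{1 + \sin^2 x}. \]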


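The formidable appearing function

\[ f(x) = \int_a^{\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)} \frac{1}{1 + \sin^2 t}\,dt \]

is also a composition; in fact, f = F ∘ F. Therefore

\[ f'(x) = F'(F(x)) \cdot F'(x)
        = \frac{1}{1 + \sin^2\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)} \cdot \frac{1}{1 + \sin^2 x}. \]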

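As these examples reveal, the expression occurring above (or below) the
integral sign indicates the function which will appear on the right when f is
written as a composition. As a final example, consider the triple compositions

\[ f(x) = \int_a^{\left(\int_a^{x^3} \frac{1}{1 + \sin^2 t}\,dt\right)} \frac{1}{1 + \sin^2 t}\,dt, \]

\[ g(x) = \int_a^{\left(\int_a^{\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)} \frac{1}{1 + \sin^2 t}\,dt\right)} \frac{1}{1 + \sin^2 t}\,dt, \]

which can be written

\[ f = F \circ F \circ C \quad\text{and}\quad g = F \circ F \circ F. \]

Omitting the intermediate steps (which you may supply, if you still feel
insecure), we obtain

\[ f'(x) = \frac{1}{1 + \sin^2\left(\int_a^{x^3} \frac{1}{1 + \sin^2 t}\,dt\right)}
          \cdot \frac{1}{1 + \sin^2 x^3} \cdot 3x^2, \]

\[ g'(x) = \frac{1}{1 + \sin^2\left[\int_a^{\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)} \frac{1}{1 + \sin^2 t}\,dt\right]}
          \cdot \frac{1}{1 + \sin^2\left(\int_a^x \frac{1}{1 + \sin^2 t}\,dt\right)}
          \cdot \frac{1}{1 + \sin^2 x}. \]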


Like the simpler differentiations of Chapter 10, these manipulations should 
become much easier after the practice provided by some of the problems, and, 
like the problems of Chapter 10, these differentiations are simply a test of your 
understanding of the Chain Rule, in the somewhat unfamiliar context pro- 
vided by the Fundamental Theorem of Calculus. 
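These Chain Rule computations can be checked against difference quotients. The sketch below takes a = 0 and approximates F with a midpoint rule; both choices, and the function names, are assumptions made for illustration only.

```python
import math


def integrand(t):
    return 1.0 / (1.0 + math.sin(t) ** 2)


def F(x, a=0.0, steps=4_000):
    """Midpoint approximation of the integral of 1/(1 + sin^2 t) from a to x."""
    width = (x - a) / steps
    return sum(integrand(a + (i + 0.5) * width) for i in range(steps)) * width


def numeric_derivative(func, x, h=1e-4):
    return (func(x + h) - func(x - h)) / (2 * h)


def f_cube(x):            # f = F o C with C(x) = x^3
    return F(x ** 3)


def f_cube_prime(x):      # F'(x^3) * 3x^2
    return integrand(x ** 3) * 3 * x ** 2


def f_nested(x):          # f = F o F
    return F(F(x))


def f_nested_prime(x):    # F'(F(x)) * F'(x)
    return integrand(F(x)) * integrand(x)


if __name__ == "__main__":
    x = 0.8
    print(numeric_derivative(f_cube, x), f_cube_prime(x))
    print(numeric_derivative(f_nested, x), f_nested_prime(x))
```

In each printed pair the difference quotient and the Chain Rule formula should agree to several decimal places.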

The powerful uses to which the integral will be put in the following chapters 
all depend on the Fundamental Theorem of Calculus, yet the proof of that 
theorem was quite easy; it seems that all the real work went into the definition
of the integral. Actually, this is not quite true. In order to apply Theorem 1 to
a continuous function we need the one theorem whose proof has not yet been
given: If f is continuous on [a, b], then f is integrable on [a, b]. The proof which
will be given depends upon a trick which is not very illuminating, but which 
at least serves to fill this gap in our knowledge. 

If f is any bounded function on [a, b], then

\[ \sup\{L(f, P)\} \quad\text{and}\quad \inf\{U(f, P)\} \]

will both exist, even if f is not integrable. These numbers are called the lower
integral of f on [a, b] and the upper integral of f on [a, b], respectively, and
will be denoted by

\[ \mathbf{L}\int_a^b f \quad\text{and}\quad \mathbf{U}\int_a^b f. \]


The lower and upper integrals both have several properties which the integral
possesses. In particular, if a < c < b, then

\[ \mathbf{L}\int_a^b f = \mathbf{L}\int_a^c f + \mathbf{L}\int_c^b f
   \quad\text{and}\quad
   \mathbf{U}\int_a^b f = \mathbf{U}\int_a^c f + \mathbf{U}\int_c^b f, \]

and if m ≤ f(x) ≤ M for all x in [a, b], then

\[ m(b - a) \le \mathbf{L}\int_a^b f \le \mathbf{U}\int_a^b f \le M(b - a). \]

The proofs of these facts are left as an exercise, since they are quite similar to
the corresponding proofs for integrals. The results for integrals are actually a
corollary of the results for upper and lower integrals, because f is integrable
precisely when

\[ \mathbf{L}\int_a^b f = \mathbf{U}\int_a^b f. \]

We will prove that a continuous function f is integrable by showing that this
equality always holds for continuous functions. It is actually easier to show
that

\[ \mathbf{L}\int_a^x f = \mathbf{U}\int_a^x f \]

for all x in [a, b]; the trick is to note that most of the proof of Theorem 1 didn’t
even depend on the fact that f was integrable!

THEOREM 13-3  If f is continuous on [a, b], then f is integrable on [a, b].

PROOF  Define functions L and U on [a, b] by

\[ L(x) = \mathbf{L}\int_a^x f \quad\text{and}\quad U(x) = \mathbf{U}\int_a^x f. \]


Let x be in (a, b). If h > 0 and

\[ m_h = \inf\{f(t) : x \le t \le x + h\}, \qquad M_h = \sup\{f(t) : x \le t \le x + h\}, \]

then

\[ m_h \cdot h \le \mathbf{L}\int_x^{x+h} f \le \mathbf{U}\int_x^{x+h} f \le M_h \cdot h, \]

so

\[ m_h \cdot h \le L(x + h) - L(x) \le U(x + h) - U(x) \le M_h \cdot h \]

or

\[ m_h \le \frac{L(x + h) - L(x)}{h} \le \frac{U(x + h) - U(x)}{h} \le M_h. \]

If h < 0 and

\[ m_h = \inf\{f(t) : x + h \le t \le x\}, \qquad M_h = \sup\{f(t) : x + h \le t \le x\}, \]

one obtains the same inequality, precisely as in the proof of Theorem 1.
Since f is continuous at x, we have

\[ \lim_{h \to 0} m_h = \lim_{h \to 0} M_h = f(x), \]

and this proves that

\[ L'(x) = U'(x) = f(x) \quad\text{for } x \text{ in } (a, b). \]

This means that there is a number c such that

\[ U(x) = L(x) + c \quad\text{for all } x \text{ in } [a, b]. \]

Since

\[ U(a) = L(a) = 0, \]

the number c must equal 0, so

\[ U(x) = L(x) \quad\text{for all } x \text{ in } [a, b]. \]

In particular,

\[ \mathbf{U}\int_a^b f = U(b) = L(b) = \mathbf{L}\int_a^b f, \]

and this means that f is integrable on [a, b]. ∎
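For one particular continuous function the squeezing of lower and upper sums can be watched directly. The sketch assumes f(x) = x² on [-1, 2] and equal subdivisions, chosen because the inf and sup on each subinterval can then be written down exactly; it illustrates the conclusion of the theorem rather than its proof.

```python
def lower_upper(n, a=-1.0, b=2.0):
    """Lower and upper sums of f(x) = x^2 for [a, b] cut into n equal parts."""
    lower = upper = 0.0
    for i in range(n):
        s = a + (b - a) * i / n
        t = a + (b - a) * (i + 1) / n
        sup = max(s * s, t * t)
        # x^2 attains the value 0 on any subinterval containing 0
        inf = 0.0 if s <= 0.0 <= t else min(s * s, t * t)
        lower += inf * (t - s)
        upper += sup * (t - s)
    return lower, upper


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, lower_upper(n))
```

As n grows, the two sums should close in on a common value from below and above; for this f that value is 3, the integral of x² from -1 to 2.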


PROBLEMS 


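1. Find the derivatives of each of the following functions.

(i) \( F(x) = \int_a^{x^3} \sin^3 t\,dt. \)

(ii) \( F(x) = \int_3^{\left(\int_1^x \sin^3 t\,dt\right)} \frac{1}{1 + \sin^6 t + t^2}\,dt. \)

(iii) \( F(x) = \int_{15}^{x} \left(\int_8^y \frac{1}{1 + t^2 + \sin^2 t}\,dt\right) dy. \)

(iv) \( F(x) = \int_x^b \frac{1}{1 + t^2 + \sin^2 t}\,dt. \)

(v) \( F(x) = \int_a^b \frac{x}{1 + t^2 + \sin^2 t}\,dt. \)

(vi) \( F(x) = \sin\left(\int_0^x \sin\left(\int_0^y \sin^3 t\,dt\right) dy\right). \)

(vii) \( F^{-1} \), where \( F(x) = \int_1^x \frac{1}{t}\,dt. \)

(viii) \( F^{-1} \), where \( F(x) = \int_0^x \frac{1}{\sqrt{1 - t^2}}\,dt. \)

(Find \( (F^{-1})'(x) \) in terms of \( F^{-1}(x) \).)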


2. For each of the following f, if \( F(x) = \int_0^x f \), at which points x is
F’(x) = f(x)? (Caution: it might happen that F’(x) = f(x), even if f is not
continuous at x.)

(i) f(x) = 0 if x ≤ 1, f(x) = 1 if x > 1.
(ii) f(x) = 0 if x < 1, f(x) = 1 if x ≥ 1.
(iii) f(x) = 0 if x ≠ 1, f(x) = 1 if x = 1.
(iv) f(x) = 0 if x is irrational, f(x) = 1/q if x = p/q in lowest terms.
(v) f(x) = 0 if x ≤ 0, f(x) = x if x ≥ 0.
(vi) f(x) = 0 if x ≤ 0, f(x) = 1/[1/x] if x > 0.
(vii) f is the function shown in Figure 6.
(viii) f(x) = 1 if x = 1/n for some n in N, f(x) = 0 otherwise.

FIGURE 6


3. Find \( (f^{-1})'(0) \) if

(i) \( f(x) = \int_0^x 1 + \sin(\sin t)\,dt. \)
(ii) \( f(x) = \int_1^x \sin(\sin t)\,dt. \)

(Don’t try to evaluate f explicitly.)

4. Find F’(x) if \( F(x) = \int_0^x x f(t)\,dt \). (The answer is not xf(x); you should
perform an obvious manipulation on the integral before trying to find F’.)

5. Prove that if f is continuous, then

\[ \int_0^x f(u)(x - u)\,du = \int_0^x \left(\int_0^u f(t)\,dt\right) du. \]

Hint: Differentiate both sides, making use of Problem 4.

**6. Use Problem 5 to prove that

\[ \int_0^x f(u)(x - u)^2\,du = 2\int_0^x \left(\int_0^{u_2} \left(\int_0^{u_1} f(t)\,dt\right) du_1\right) du_2. \]

7. Find a function f such that \( f'''(x) = 1/\sqrt{1 + \sin^2 x} \). (This problem is
supposed to be easy; don’t misinterpret the word “find.”)

*8. A function f is periodic, with period a, if f(x + a) = f(x) for all x.

(a) If f is periodic with period a and integrable on [0, a], show that

\[ \int_0^a f = \int_b^{b+a} f \quad\text{for all } b. \]


(b) Find a function f such that f is not periodic, but f’ is. Hint: Choose a
periodic g for which it can be guaranteed that \( f(x) = \int_0^x g \) is not
periodic.

(c) Suppose that f’ is periodic with period a. Prove that f is periodic
if and only if f(a) = f(0).

9. Find \( \int_a^b \sqrt[n]{x}\,dx \), by simply guessing a function f with
\( f'(x) = \sqrt[n]{x} \), and using the Second Fundamental Theorem of Calculus.
Then check with Problem 13-16.

*10. Use the Fundamental Theorem of Calculus and Problem 13-16 to
derive the result stated in Problem 12-15.

11. (a) Find the derivatives of \( F(x) = \int_1^x 1/t\,dt \) and \( G(x) = \int_b^{bx} 1/t\,dt \).
(b) Now give a new proof for Problem 13-12.

*12. Use the Fundamental Theorem of Calculus and Darboux’s Theorem
(Problem 11-39) to give another proof of the Intermediate Value
Theorem.

13. Prove that if h is continuous, f and g are differentiable, and

\[ F(x) = \int_{f(x)}^{g(x)} h(t)\,dt, \]

then F’(x) = h(g(x)) · g’(x) - h(f(x)) · f’(x). Hint: Try to reduce this
to the two cases you can already handle, with a constant either as the
lower or the upper limit of integration.

*14. (a) Suppose G’ = g and F’ = f. Prove that if the function y satisfies the
differential equation

(*) g(y(x)) · y’(x) = f(x) for all x in some interval,

then there is a number c such that

(**) G(y(x)) = F(x) + c for all x in this interval.

(b) Show, conversely, that if y satisfies (**), then y is a solution of (*).

(c) Find what condition y must satisfy if

\[ y'(x) = \frac{1 + x^2}{1 + y(x)}. \]

(In this case g(t) = 1 + t and f(t) = 1 + t².) Then “solve” the
resulting equations to find all possible solutions y (no solution will
have R as its domain).

(d) Find what condition y must satisfy if

\[ y'(x) = \frac{-1}{1 + 5[y(x)]^4}. \]

(An appeal to Problem 12-10 will show that there are functions
satisfying the resulting equation.)


(e) Find all functions y satisfying

\[ y(x)\,y'(x) = x. \]

Find the solution y satisfying y(0) = -1.

*15. The limit \( \lim_{N \to \infty} \int_a^N f \), if it exists, is denoted by
\( \int_a^\infty f \) (or \( \int_a^\infty f(x)\,dx \)), and called an “improper integral.”

(a) Determine \( \int_1^\infty x^r\,dx \), if r < -1.

(b) Use Problem 13-12 to show that \( \int_1^\infty 1/x\,dx \) does not exist. Hint:
What can you say about \( \int_1^{2^n} 1/x\,dx \)?

(c) Suppose that f(x) ≥ 0 for x ≥ 0 and that \( \int_0^\infty f \) exists. Prove that if
0 ≤ g(x) ≤ f(x) for all x ≥ 0, then \( \int_0^\infty g \) also exists.

(d) Explain why \( \int_0^\infty 1/(1 + x^2)\,dx \) exists. Hint: Split this integral up at 1.

*16. The improper integral \( \int_{-\infty}^a f \) is defined in the obvious way, as
\( \lim_{N \to -\infty} \int_N^a f \). But another kind of improper integral
\( \int_{-\infty}^\infty f \) is defined in a nonobvious way: it is
\( \int_0^\infty f + \int_{-\infty}^0 f \), provided these improper integrals both exist.

(a) Explain why \( \int_{-\infty}^\infty 1/(1 + x^2)\,dx \) exists.

(b) Explain why \( \int_{-\infty}^\infty x\,dx \) does not exist. (But notice that
\( \lim_{N \to \infty} \int_{-N}^N x\,dx \) does exist.)

(c) Prove that if \( \int_{-\infty}^\infty f \) exists, then \( \lim_{N \to \infty} \int_{-N}^N f \)
exists and equals \( \int_{-\infty}^\infty f \). Show moreover, that
\( \lim_{N \to \infty} \int_{-N}^{N+1} f \) and \( \lim_{N \to \infty} \int_{-N^2}^{N} f \) both exist and
equal \( \int_{-\infty}^\infty f \). Can you state a reasonable generalization of these
facts? (If you can’t, you will have a miserable time trying to do these
special cases!)

*17. There is another kind of “improper integral” in which the interval is
bounded, but the function is unbounded:

(a) If a > 0, find \( \lim_{\varepsilon \to 0^+} \int_\varepsilon^a 1/\sqrt{x}\,dx \). This limit is
denoted by \( \int_0^a 1/\sqrt{x}\,dx \), even though the function f(x) = 1/√x is not
bounded on [0, a], no matter how we define f(0).

(b) Find \( \int_0^a x^r\,dx \) if -1 < r < 0.

(c) Use Problem 13-12 to show that \( \int_0^a x^{-1}\,dx \) does not make sense, even
as a limit.

(d) Invent a reasonable definition of \( \int_a^0 |x|^r\,dx \) for a < 0 and compute it
for -1 < r < 0.

(e) Invent a reasonable definition of \( \int_{-1}^1 (1 - x^2)^{-1/2}\,dx \), as a sum
of two limits, and show that the limits exist. Hint: Why does
\( \int_{-1}^0 (1 + x)^{-1/2}\,dx \) exist? How does \( (1 + x)^{-1/2} \) compare with
\( (1 - x^2)^{-1/2} \) for -1 < x < 0?

*18. It is possible, finally, to combine the two possible extensions of the notion
of the integral.

(a) If f(x) = 1/√x for 0 < x ≤ 1 and f(x) = 1/x² for x ≥ 1, find
\( \int_0^\infty f(x)\,dx \) (after deciding what this should mean).

(b) Show that \( \int_0^\infty x^r\,dx \) never makes sense. (Distinguish the cases
-1 < r < 0 and r < -1. In one case things go wrong at 0, in the
other case at ∞; for r = -1 things go wrong at both places.)


FIGURE 3


CHAPTER 


THE TRIGONOMETRIC FUNCTIONS 


The definitions of the functions sin and cos are considerably more subtle than 
one might suspect. For this reason, this chapter begins with some informal and 
intuitive definitions, which should not be scrutinized too carefully, as they 
shall soon be replaced by the formal definitions which we really intend to use. 

In elementary geometry an angle is simply the union of two half-lines with 
a common initial point (Figure 1). 




FIGURE 1 


More useful for trigonometry are “directed angles,” which may be regarded
as pairs (l_1, l_2) of half-lines with the same initial point, visualized as in Figure 2.

FIGURE 2

If for l_1 we always choose the positive half of the horizontal axis, a directed
angle is described completely by the second half-line (Figure 3).

Since each half-line intersects the unit circle precisely once, a directed angle 
is described, even more simply, by a point on the unit circle (Figure 4), that is, 
by a point (x, y) with x? + y? = 1. 




FIGURE 4 


FIGURE 5 


FIGURE 6 




The sine and cosine of a directed angle can now be defined as follows
(Figure 5): a directed angle is determined by a point (x, y) with x² + y² = 1;
the sine of the angle is defined as y, and the cosine as x.

Despite the aura of precision surrounding the previous paragraph, we are
not yet finished with the definitions of sin and cos. Indeed, we have barely
begun. What we have defined is the sine and cosine of a directed angle; what
we want to define is sin x and cos x for each number x. The usual procedure for
doing this depends on associating an angle to every number. The oldest
method is to “measure angles in degrees.” An angle “all the way around” is
associated to 360, an angle “half-way around” is associated to 180, an angle
“a quarter way around” to 90, etc. (Figure 6). The angle associated, in this
manner, to the number x, is called “the angle of x degrees.” The angle of
0 degrees is the same as the angle of 360 degrees, and this ambiguity is
purposely extended further, so that an angle of 90 degrees is also an angle of
360 + 90 degrees, etc. One can now define a function, which we will denote
by sin°, as follows:

sin°(x) = sine of the angle of x degrees.


There are two difficulties with this approach. Although it may be clear
what we mean by an angle of 90 or 45 degrees, it is not quite clear what an
angle of √2 degrees is, for example. Even if this difficulty could be
circumvented, it is unlikely that this system, depending as it does on the
arbitrary choice of 360, will lead to elegant results; it would be sheer luck if
the function sin° had mathematically pleasing properties.

FIGURE 7

FIGURE 8

FIGURE 9

“Radian measure” appears to offer a remedy for both these defects. Given
any number x, choose a point P on the unit circle such that x is the length of
the arc of the circle beginning at (1, 0) and running counterclockwise to P
(Figure 7). The directed angle determined by P is called “the angle of x
radians.” Since the length of the whole circle is 2π, the angle of x radians and
the angle of 2π + x radians are identical. A function sin^r can now be defined
as follows:

sin^r(x) = sine of the angle of x radians.

This same method can easily be adopted to define sin°; since we want to
have sin° 360 = sin^r 2π, we can define

\[ \sin^{\circ} x = \sin^{r} \frac{2\pi x}{360} = \sin^{r} \frac{\pi x}{180}. \]

We shall soon drop the superscript r in sin^r, since sin^r (and not sin°) is the
only function which will interest us; before we do, a few words of warning are
advisable.

The expressions sin° x and sin^r x are sometimes written

sin x°
sin x radians,

but this notation is quite misleading; a number x is simply a number, and it
does not carry a banner indicating that it is “in degrees” or “in radians.” If the
meaning of the notation “sin x” is in doubt one usually asks:

“Is x in degrees or radians?”

but what one means is:

“Do you mean ‘sin°’ or ‘sin^r’?”


Even for mathematicians, addicted to precision, these remarks might be 
dispensable, were it not for the fact that failure to take them into account will 
lead to incorrect answers to certain problems (an example is given in Problem 
18). 
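The point that sin° and sin^r are two different functions, rather than one function fed differently labelled numbers, can be made concrete. In this sketch `sin_r` simply wraps the standard library sine and `sin_deg` is defined from it by the formula sin° x = sin^r(πx/180); the derivative comparison is offered as one familiar place where confusing the two gives a wrong answer, without claiming it is the example the text has in mind.

```python
import math


def sin_r(x):
    """The function written sin^r in the text (the standard library sine)."""
    return math.sin(x)


def sin_deg(x):
    """The function written sin° in the text: sin°(x) = sin^r(pi * x / 180)."""
    return sin_r(math.pi * x / 180.0)


def numeric_derivative(func, x, h=1e-5):
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    print(sin_deg(360.0), sin_r(2 * math.pi))   # the two functions agree here by design
    print(numeric_derivative(sin_r, 0.0))       # slope of sin^r at 0
    print(numeric_derivative(sin_deg, 0.0))     # slope of sin° at 0, a different number
```

The slope of sin° at 0 comes out near π/180 rather than 1, a direct consequence of the Chain Rule applied to the defining formula.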

Although the function sin^r is the function which we wish to denote simply
by sin (and use exclusively henceforth), there is a difficulty involved even in
the definition of sin^r. Our proposed definition depends on the concept of
length of a curve. Although we could digress, and define length, at the present 
time it is much easier to reformulate the definition in terms of areas, which we 
can treat by means of the integral. (A treatment in terms of length is outlined 
in Problems 32 to 34.) 

Suppose that x is the length of the arc of the unit circle from (1, 0) to P; this
arc thus contains x/2π of the total length 2π of the circumference of the unit
circle. Let S denote the “sector” shown in Figure 8; S is bounded by the unit
circle, the horizontal axis, and the half-line through (0, 0) and P. The area of
S should be x/2π times the area inside the unit circle, which we expect to be π;
thus S should have area

\[ \frac{x}{2\pi} \cdot \pi = \frac{x}{2}. \]

We can therefore define cos x and sin x as the coordinates of the point P which
determines a sector of area x/2.

With these remarks as background, the rigorous definition of the functions
sin and cos now begins. The first definition identifies π as the area of the unit
circle; more precisely, as twice the area of a semicircle (Figure 9).

DEFINITION

\[ \pi = 2 \cdot \int_{-1}^{1} \sqrt{1 - x^2}\,dx. \]

(This definition is not offered simply as an embellishment; to define the
trigonometric functions it will be necessary to first define sin x and cos x only
for 0 ≤ x ≤ π.)
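The defining integral can be evaluated approximately to see that it produces the familiar number. The midpoint rule and the function names below are assumptions of this sketch; the comparison with `math.pi` is only a plausibility check.

```python
import math


def semicircle_area(steps=100_000):
    """Midpoint approximation of the integral of sqrt(1 - x^2) over [-1, 1]."""
    width = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * width
        total += math.sqrt(1.0 - x * x)
    return total * width


def pi_from_definition(steps=100_000):
    """Twice the area of the upper unit semicircle, as in the definition of pi."""
    return 2.0 * semicircle_area(steps)


if __name__ == "__main__":
    print(pi_from_definition(), math.pi)
```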

FIGURE 10

FIGURE 11

FIGURE 12

The second definition is meant to describe, for -1 ≤ x ≤ 1, the area A(x)
of the sector bounded by the unit circle, the horizontal axis, and the half-line
through (x, √(1 - x²)). If 0 ≤ x ≤ 1, this area can be expressed (Figure 10) as
the sum of the area of a triangle and the area of a region under the unit circle:

\[ \frac{x\sqrt{1 - x^2}}{2} + \int_x^1 \sqrt{1 - t^2}\,dt. \]

This same formula happens to work for -1 ≤ x ≤ 0 also. In this case (Figure
11), the term

\[ \frac{x\sqrt{1 - x^2}}{2} \]

is negative, and represents the area of the triangle which must be subtracted
from the term

\[ \int_x^1 \sqrt{1 - t^2}\,dt. \]

DEFINITION  If -1 ≤ x ≤ 1, then

\[ A(x) = \frac{x\sqrt{1 - x^2}}{2} + \int_x^1 \sqrt{1 - t^2}\,dt. \]

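Notice that if -1 < x < 1, then A is differentiable at x and (using the
Fundamental Theorem of Calculus),

\[ A'(x) = \frac12\left[x \cdot \frac{-2x}{2\sqrt{1 - x^2}} + \sqrt{1 - x^2}\right] - \sqrt{1 - x^2} \]
\[ = \frac12\left[\frac{-x^2 + (1 - x^2)}{\sqrt{1 - x^2}}\right] - \sqrt{1 - x^2} \]
\[ = \frac{1 - 2x^2}{2\sqrt{1 - x^2}} - \sqrt{1 - x^2} \]
\[ = \frac{1 - 2x^2 - 2(1 - x^2)}{2\sqrt{1 - x^2}} \]
\[ = \frac{-1}{2\sqrt{1 - x^2}}. \]

Notice also (Figure 12) that on the interval [-1, 1] the function A decreases
from

\[ A(-1) = 0 + \int_{-1}^{1} \sqrt{1 - t^2}\,dt = \frac{\pi}{2} \]

to A(1) = 0. This follows directly from the definition of A, and also from the
fact that its derivative is negative on (-1, 1).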
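The formula for A’ and the end values of A can be checked numerically. In this sketch the integral in A is approximated by a midpoint rule and the derivative by a central difference; both are assumptions of the example, and the tolerances one can expect are correspondingly modest.

```python
import math


def A(x, steps=50_000):
    """A(x) = x*sqrt(1 - x^2)/2 + integral of sqrt(1 - t^2) from x to 1, for -1 <= x <= 1."""
    width = (1.0 - x) / steps
    integral = 0.0
    for i in range(steps):
        t = x + (i + 0.5) * width
        integral += math.sqrt(max(0.0, 1.0 - t * t))  # guard against tiny negative rounding
    return x * math.sqrt(1.0 - x * x) / 2.0 + integral * width


def A_prime_formula(x):
    """The derivative computed in the text: -1 / (2 * sqrt(1 - x^2))."""
    return -1.0 / (2.0 * math.sqrt(1.0 - x * x))


def numeric_derivative(func, x, h=1e-3):
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    print(A(-1.0), math.pi / 2, A(1.0))
    print(numeric_derivative(A, 0.3), A_prime_formula(0.3))
```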
For 0 ≤ x ≤ π we wish to define cos x and sin x as the coordinates of a point
P = (cos x, sin x) on the unit circle which determines a sector whose area is
x/2 (Figure 13). In other words:

DEFINITION  If 0 ≤ x ≤ π, then cos x is the unique number in [-1, 1] such that

\[ A(\cos x) = \frac{x}{2}; \]

and

\[ \sin x = \sqrt{1 - (\cos x)^2}. \]
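Since A is continuous and decreasing, the number cos x in this definition can be located by bisection. The sketch below does exactly that with a numerically integrated A; the bisection strategy, step counts, and function names are assumptions of the example, and `math.cos` is used only as an outside check.

```python
import math


def A(y, steps=5_000):
    """Sector area A(y) = y*sqrt(1 - y^2)/2 + integral of sqrt(1 - t^2) from y to 1."""
    width = (1.0 - y) / steps
    integral = 0.0
    for i in range(steps):
        t = y + (i + 0.5) * width
        integral += math.sqrt(max(0.0, 1.0 - t * t))
    return y * math.sqrt(1.0 - y * y) / 2.0 + integral * width


def cos_from_area(x, iterations=40):
    """For 0 <= x <= pi, find y in [-1, 1] with A(y) = x/2 by bisection."""
    low, high = -1.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if A(mid) > x / 2.0:
            low = mid    # A is decreasing, so the solution lies to the right of mid
        else:
            high = mid
    return (low + high) / 2.0


def sin_from_area(x):
    c = cos_from_area(x)
    return math.sqrt(max(0.0, 1.0 - c * c))


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, cos_from_area(x), math.cos(x), sin_from_area(x), math.sin(x))
```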


FIGURE 13

This definition actually requires a few words of justification. In order to
know that there is a number y satisfying A(y) = x/2, we use the fact that A is
continuous, and that A takes on the values 0 and π/2. This tacit appeal to the
Intermediate Value Theorem is crucial, if we want to make our preliminary
definition precise. Having made, and justified, our definition, we can now
proceed quite rapidly.

THEOREM 1  If 0 < x < π, then

\[ \cos'(x) = -\sin x, \]
\[ \sin'(x) = \cos x. \]

PROOF  If B = 2A, then the definition A(cos x) = x/2 can be written

\[ B(\cos x) = x; \]

in other words, cos is just the inverse of B. We have already computed that

\[ A'(x) = -\frac{1}{2\sqrt{1 - x^2}}, \]

from which we conclude that

\[ B'(x) = -\frac{1}{\sqrt{1 - x^2}}. \]

Consequently,

\[ \cos'(x) = (B^{-1})'(x) = \frac{1}{B'(B^{-1}(x))}
            = \frac{1}{-\dfrac{1}{\sqrt{1 - [B^{-1}(x)]^2}}}
            = -\sqrt{1 - (\cos x)^2} = -\sin x. \]

Since

\[ \sin x = \sqrt{1 - (\cos x)^2}, \]

we also obtain

\[ \sin'(x) = \frac12 \cdot \frac{-2\cos x \cdot \cos'(x)}{\sqrt{1 - (\cos x)^2}}
            = \frac{\cos x \sin x}{\sin x} = \cos x. \] ∎


FIGURE 14

FIGURE 15

FIGURE 16

The information contained in Theorem 1 can be used to sketch the graphs
of sin and cos on the interval [0, π]. Since

\[ \cos'(x) = -\sin x < 0, \qquad 0 < x < \pi, \]

the function cos decreases from cos 0 = 1 to cos π = -1 (Figure 14).
Consequently, cos y = 0 for a unique y in [0, π]. To find y, we note that the
definition of cos,

\[ A(\cos x) = \frac{x}{2}, \]

means that

\[ A(0) = \frac{y}{2}, \]

so

\[ y = 2\int_0^1 \sqrt{1 - t^2}\,dt. \]

It is easy to see that

\[ \int_{-1}^{0} \sqrt{1 - t^2}\,dt = \int_0^1 \sqrt{1 - t^2}\,dt, \]

so we can also write

\[ y = \int_{-1}^{1} \sqrt{1 - t^2}\,dt = \frac{\pi}{2}. \]

Now we have

\[ \sin'(x) = \cos x \ \begin{cases} > 0, & 0 < x < \pi/2 \\ < 0, & \pi/2 < x < \pi, \end{cases} \]

so sin increases on [0, π/2] from sin 0 = 0 to sin π/2 = 1, and then decreases
on [π/2, π] to sin π = 0 (Figure 15).


The values of sin x and cos x for x not in [0, 7] are most easily defined by a 
two-step piecing together process: 


(1) If π ≤ x ≤ 2π, then

sin x = -sin(2π - x),
cos x = cos(2π - x).

Figure 16 shows the graphs of sin and cos on [0, 2π].

(2) If x = 2πk + x’ for some integer k, and some x’ in [0, 2π], then

sin x = sin x’,
cos x = cos x’.

Figure 17 shows the graphs of sin and cos, now defined on all of R. 
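The two-step piecing together process translates directly into code. In the sketch below `base_sin` and `base_cos` stand in for the functions already defined on [0, π] (here they simply call the standard library, which is an assumption of the example), and `extended` uses only their values on that interval.

```python
import math


def base_sin(x):
    """Stand-in for sin on [0, pi]."""
    return math.sin(x)


def base_cos(x):
    """Stand-in for cos on [0, pi]."""
    return math.cos(x)


def extended(x):
    """Return (sin x, cos x) for any real x using the two piecing-together steps."""
    two_pi = 2 * math.pi
    # Step (2): write x = 2*pi*k + x_prime with x_prime in [0, 2*pi].
    x_prime = x - two_pi * math.floor(x / two_pi)
    if x_prime <= math.pi:
        return base_sin(x_prime), base_cos(x_prime)
    # Step (1): for pi <= x_prime <= 2*pi, reflect into [0, pi].
    reflected = two_pi - x_prime
    return -base_sin(reflected), base_cos(reflected)


if __name__ == "__main__":
    for x in (-7.3, 4.0, 10.0):
        print(x, extended(x), (math.sin(x), math.cos(x)))
```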


Having extended the functions sin and cos to R, we must now check that 
the basic properties of these functions continue to hold. In most cases this is 
easy. For example, it is clear that the equation 


sin² x + cos² x = 1

holds for all x. It is also not hard to prove that

\[ \sin'(x) = \cos x, \]
\[ \cos'(x) = -\sin x, \]

if x is not a multiple of π. For example, if π < x < 2π, then

\[ \sin x = -\sin(2\pi - x), \]

so

\[ \sin'(x) = -\sin'(2\pi - x) \cdot (-1) = \cos(2\pi - x) = \cos x. \]

If x is a multiple of π we resort to a trick; it is only necessary to apply
Theorem 11-7 to conclude that the same formulas are true in this case also.

FIGURE 17


FIGURE 18

FIGURE 19

FIGURE 20

The other standard trigonometric functions present no difficulty at all. We
define

\[ \sec x = \frac{1}{\cos x}, \qquad \tan x = \frac{\sin x}{\cos x}, \qquad x \ne k\pi + \pi/2, \]

\[ \csc x = \frac{1}{\sin x}, \qquad \cot x = \frac{\cos x}{\sin x}, \qquad x \ne k\pi. \]

The graphs are sketched in Figure 18. It is a good idea to convince yourself
that the general features of these graphs can be predicted from the derivatives
of these functions, which are listed in the next theorem (there is no need to
memorize the statement of the theorem, since the results can be rederived
whenever needed).

THEOREM 2  If x ≠ kπ + π/2, then

\[ \sec'(x) = \sec x \tan x, \]
\[ \tan'(x) = \sec^2 x. \]

If x ≠ kπ, then

\[ \csc'(x) = -\csc x \cot x, \]
\[ \cot'(x) = -\csc^2 x. \]

PROOF  Left to you (a straightforward computation). ∎


The inverses of the trigonometric functions are also easily differentiated. 
The trigonometric functions are not one-one, so it is first necessary to restrict 
them to suitable intervals; the largest possible length obtainable is π, and the
intervals usually chosen are (Figure 19)

[-π/2, π/2] for sin,
[0, π] for cos,
(-π/2, π/2) for tan.

(The inverses of the other trigonometric functions are so rarely used that they
will not even be discussed here.)
The inverse of the function

f(x) = sin x, -π/2 ≤ x ≤ π/2

is denoted by arcsin (Figure 20); the domain of arcsin is [-1, 1]. The
notation sin⁻¹ has been avoided because arcsin is not the inverse of sin
(which is not one-one), but of the restricted function f; the notation sin⁻¹ has
the additional disadvantage that sin⁻¹(x) might be construed as 1/sin x.
The inverse of the function

g(x) = cos x, 0 ≤ x ≤ π


264 Derivatives and Integrals 


FIGURE 21 


THEOREM 3 
FIGURE 23 


PROOF 


is denoted by arccos (Figure 21); the domain of arccos is [—1, 11. 
The inverse of the function 


A(x) = tanx, —m7/2<x< 7/2 


is denoted by arctan (Figure 22); arctan is one of the simplest examples of a 
differentiable function which is bounded even though it is one-one on all of R. 
The derivatives of the inverse trigonometric functions are surprisingly sim- 
ple, and do not involve trigonometric functions at all. Finding the derivatives 
is a simple matter, but to express them in a suitable form we will have to sim- 
plify expressions like 
cos(arcsin x), 
sec(arctan x). 


FIGURE 22 


A little picture is the best way to remember the correct simplifications. For 
example, Figure 23 shows a directed angle whose sine is x—the angle shown is 
thus an angle of (arcsin x) radians; consequently cos(arcsin x) is the length of 
the other side, namely, νΊ — x2. However, in the proof of the next theorem 
we wili not resort to such pictures. 


If —1 <x < 1, then 


) 1 

arcsin’(x) = ----- - 

V1 — x? 
arccos’(x) = σξξε- 

-χ 
Moreover, for all x we have 
1 

arctan’(x) = 

1 ΞΡ ἃ" 


arcsin’(x) = (f—*)’(x) 
1 


Γ (x)) 
| 1 

- sin’(arcsin x) 
τς τ 
cos(arcsin x) 


The Trigonometric Functions 265 


Now | 
[sin(arcsin x) |? + [cos(arcsin x)]? = 1, 
that is, 
x? + [cos(arcsin x)]? = 1; 
therefore, 


cos(arcsin x) = V1 — x2. 


(The positive square root is to be taken because arcsin x is in (—7/2, 1/2), so 
cos(arcsin x) > 0.) This proves the first formula. 

The second formula has already been established (in the proof of 
Theorem 1). It is also possible to imitate the proof for the first formula, a 
valuable exercise if that proof presented any difficulties. The third formula 
is proved as follows. 

$$\arctan'(x) = (h^{-1})'(x) = \frac{1}{h'(h^{-1}(x))} = \frac{1}{\tan'(\arctan x)} = \frac{1}{\sec^2(\arctan x)}.$$

Dividing both sides of the identity
$$\sin^2 \alpha + \cos^2 \alpha = 1$$
by $\cos^2 \alpha$ yields
$$\tan^2 \alpha + 1 = \sec^2 \alpha.$$
It follows that
$$[\tan(\arctan x)]^2 + 1 = \sec^2(\arctan x),$$
or
$$x^2 + 1 = \sec^2(\arctan x),$$
which proves the third formula. ∎
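The three formulas of Theorem 3 can be spot-checked numerically. The sketch below is only an illustration, not part of the proof: it compares a symmetric difference quotient with the closed forms. The helper names `central_difference` and `theorem3_formulas` and the step size `h` are assumptions made for this example.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

def theorem3_formulas(x):
    """Closed forms from Theorem 3 (arcsin and arccos need -1 < x < 1)."""
    root = math.sqrt(1 - x * x)
    return {"arcsin": 1 / root, "arccos": -1 / root, "arctan": 1 / (1 + x * x)}

inverse_functions = {"arcsin": math.asin, "arccos": math.acos, "arctan": math.atan}

for x in (-0.9, -0.3, 0.0, 0.5, 0.8):
    expected = theorem3_formulas(x)
    for name, func in inverse_functions.items():
        approx = central_difference(func, x)
        print(f"{name}'({x:+.1f}) ~ {approx:.6f}   formula: {expected[name]:.6f}")
```

Agreement to several decimal places is what one would expect here; near x = ±1 the difference quotient for arcsin and arccos degrades, which matches the fact that the formulas blow up there.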
The traditional proof of the formula sin'(x) = cos x (quite different from
the one given here) is outlined in Problem 26. This proof depends upon first
establishing the limit
$$\lim_{h \to 0} \frac{\sin h}{h} = 1$$
and the "addition formula"
$$\sin(x + y) = \sin x \cos y + \cos x \sin y.$$
Both of these formulas can be derived easily now that the derivatives of sin and
cos are known. The first is just the special case sin'(0) = cos 0. The second
depends on a beautiful characterization of the functions sin and cos. In order
to derive this result we need a lemma whose proof involves a clever trick; a
more straightforward proof will be supplied in Part IV.




LEMMA Suppose f has a second derivative everywhere and that

$$f'' + f = 0,$$
$$f(0) = 0,$$
$$f'(0) = 0.$$

Then f = 0.


PROOF Multiplying both sides of the first equation by f' yields
$$f'f'' + ff' = 0.$$
Thus
$$[(f')^2 + f^2]' = 2(f'f'' + ff') = 0,$$
so $(f')^2 + f^2$ is a constant function. From f(0) = 0 and f'(0) = 0 it follows
that the constant is 0; thus
$$[f'(x)]^2 + [f(x)]^2 = 0 \quad \text{for all } x.$$
This implies that
$$f(x) = 0 \quad \text{for all } x.$$ ∎


THEOREM 4 If f has a second derivative everywhere and

$$f'' + f = 0,$$
$$f(0) = a,$$
$$f'(0) = b,$$

then

$$f = b \cdot \sin + a \cdot \cos.$$
(In particular, if f(0) = 0 and f'(0) = 1, then f = sin; if f(0) = 1 and
f'(0) = 0, then f = cos.)


PROOF Let
$$g(x) = f(x) - b \sin x - a \cos x.$$
Then
$$g'(x) = f'(x) - b \cos x + a \sin x,$$
$$g''(x) = f''(x) + b \sin x + a \cos x.$$
Consequently,
$$g'' + g = 0,$$
$$g(0) = 0,$$
$$g'(0) = 0,$$
which shows that
$$0 = g(x) = f(x) - b \sin x - a \cos x, \quad \text{for all } x.$$ ∎
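Theorem 4 says that the conditions f'' + f = 0, f(0) = a, f'(0) = b pin f down completely. As a rough numerical illustration (a sketch only; the function names, the use of the classical Runge-Kutta method, and the step count are assumptions of this example, not anything from the text), one can integrate the equation step by step and compare the result with b sin x + a cos x.

```python
import math

def solve_oscillator(a, b, x_end, steps=2000):
    """Integrate f'' + f = 0, f(0) = a, f'(0) = b with classical RK4; return f(x_end)."""
    h = x_end / steps
    f, v = a, b  # v stands for f'
    for _ in range(steps):
        # First-order system: f' = v, v' = -f.
        k1f, k1v = v, -f
        k2f, k2v = v + 0.5 * h * k1v, -(f + 0.5 * h * k1f)
        k3f, k3v = v + 0.5 * h * k2v, -(f + 0.5 * h * k2f)
        k4f, k4v = v + h * k3v, -(f + h * k3f)
        f += h * (k1f + 2 * k2f + 2 * k3f + k4f) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return f

def theorem4_value(a, b, x):
    """The function b*sin + a*cos from Theorem 4, evaluated at x."""
    return b * math.sin(x) + a * math.cos(x)

a, b, x = 2.0, -1.0, 3.0
print(solve_oscillator(a, b, x), theorem4_value(a, b, x))
```

The two printed numbers should agree closely; any initial values a and b can be substituted.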
THEOREM 5 If x and y are any two numbers, then 


sin(x + y) = sin x cos y + cos x sin y, 
cos(x + y) = cos x cos y — sin x sin y. 


PROOF 




For any particular number y we can define a function f by 


$$f(x) = \sin(x + y).$$

Then
$$f'(x) = \cos(x + y),$$
$$f''(x) = -\sin(x + y).$$
Consequently,
$$f'' + f = 0,$$
$$f(0) = \sin y,$$
$$f'(0) = \cos y.$$

It follows from Theorem 4 that

$$f = (\cos y) \cdot \sin + (\sin y) \cdot \cos;$$
that is,
$$\sin(x + y) = \cos y \sin x + \sin y \cos x, \quad \text{for all } x.$$


Since any number y could have been chosen to begin with, this proves the first 
formula for all x and y. 
The second formula is proved similarly. J 


As a conclusion to this chapter, and as a prelude to Chapter 17, we will 
mention an alternative approach to the definition of the function sin. Since 
$$\arcsin'(x) = \frac{1}{\sqrt{1 - x^2}} \quad \text{for } -1 < x < 1,$$
it follows from the Second Fundamental Theorem of Calculus that
$$\arcsin x = \arcsin x - \arcsin 0 = \int_0^x \frac{1}{\sqrt{1 - t^2}}\,dt.$$
This equation could have been taken as the definition of arcsin. It would follow
immediately that
$$\arcsin'(x) = \frac{1}{\sqrt{1 - x^2}};$$
the function sin could then be defined as $(\arcsin)^{-1}$ and the formula for the
derivative of an inverse function would show that
$$\sin'(x) = \sqrt{1 - \sin^2 x},$$
which could be defined as cos x. Eventually, one could show that $A(\cos x) = x/2$,
recovering at the very end of the development the definition with which
we started. While much of this presentation would proceed more rapidly, the 
definition would be utterly unmotivated; the reasonableness of the definitions 
would be known to the author, but not to the student, for whom it was 
intended! Nevertheless, as we shall see in Chapter 17, an approach of this sort 
is sometimes very reasonable indeed. 
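To make the integral description of arcsin concrete, the following sketch approximates the integral of $1/\sqrt{1 - t^2}$ from 0 to x by the midpoint rule and compares it with the library value. The name `arcsin_by_integral` and the number of subintervals are assumptions chosen for this illustration.

```python
import math

def arcsin_by_integral(x, pieces=20000):
    """Midpoint-rule approximation of the integral of 1/sqrt(1 - t^2) from 0 to x, for |x| < 1."""
    h = x / pieces  # negative when x < 0, which orients the integral correctly
    total = 0.0
    for i in range(pieces):
        t = (i + 0.5) * h
        total += 1 / math.sqrt(1 - t * t)
    return total * h

for x in (-0.7, 0.0, 0.5, 0.9):
    print(x, arcsin_by_integral(x), math.asin(x))
```

The approximation worsens as x approaches 1 or −1, where the integrand becomes unbounded.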




PROBLEMS 


1. Differentiate each of the following functions.
(i) f(x) = arctan(arctan(arctan x)).

(ii) f(x) = arcsin(arctan(arccos x)). 

(iii) f(x) = arctan(tan x arctan x). 


(iv) $f(x) = \arcsin\left(\dfrac{1}{\sqrt{1 + x^2}}\right)$.


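2. Find the following limits by l'Hôpital's Rule.

(i) $\lim_{x \to 0} \dfrac{\sin x - x + x^3/6}{x^3}$.

(ii) $\lim_{x \to 0} \dfrac{\sin x - x + x^3/6}{x^4}$.

(iii) $\lim_{x \to 0} \dfrac{\cos x - 1 + x^2/2}{x^2}$.

(iv) $\lim_{x \to 0} \dfrac{\cos x - 1 + x^2/2}{x^4}$.

(v) $\lim_{x \to 0} \dfrac{\arctan x - x + x^3/3}{x^3}$.

(vi) $\lim_{x \to 0} \left(\dfrac{1}{x} - \dfrac{1}{\sin x}\right)$.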
3. Let
$$f(x) = \begin{cases} \dfrac{\sin x}{x}, & x \ne 0 \\ 1, & x = 0. \end{cases}$$

(a) Find f'(0).
(b) Find f''(0).

At this point, you will almost certainly have to use l'Hôpital's Rule, but
in Chapter 23 we will be able to find $f^{(k)}(0)$ for all k, with almost no work
at all.


4. Graph the following functions.


(a) f(x) = sin 2x. 

(b) $f(x) = \sin(x^2)$. (A pretty respectable sketch of this graph can be
obtained using only a picture of the graph of sin. Indeed, pure
thought is your only hope in this problem, because determining the
sign of the derivative $f'(x) = \cos(x^2) \cdot 2x$ is no easier than
determining the behavior of f directly. The formula for f'(x) does indicate
one important fact, however: f'(0) = 0, which must be true since f
is even, and which should be clear in your graph.)

(c) f(x) = sin x + sin 2x. (It will probably be instructive to first draw
the graphs of g(x) = sin x and h(x) = sin 2x carefully on the same
set of axes, from 0 to $2\pi$, and guess what the sum will look like. You
can easily find out how many critical points f has on $[0, 2\pi]$ by
considering the derivative of f. You can then determine the nature of
these critical points by finding out the sign of f at each point; your
sketch will probably suggest the answer.)

(d) f(x) = tan x − x. (First determine the behavior of f in $(-\pi/2, \pi/2)$;
in the intervals $(k\pi - \pi/2, k\pi + \pi/2)$ the graph of f will look
exactly the same, except moved up a certain amount. Why?)

(e) f(x) = sin x — x. (The material in the Appendix to Chapter 11 will 
be particularly helpful for this function.) 

(f) $f(x) = \begin{cases} \dfrac{\sin x}{x}, & x \ne 0 \\ 1, & x = 0. \end{cases}$

(Part (d) should enable you to determine approximately where the
zeros of f' are located. Notice that f is even and continuous at 0; also
consider the size of f for large x.)


5. Prove the addition formula for cos.

6. (a) From the addition formula for sin and cos derive formulas for sin 2x,
cos 2x, sin 3x, and cos 3x.

(b) Use these formulas to find the following values of the trigonometric
functions (usually deduced by geometric arguments in elementary
trigonometry):

$$\sin \frac{\pi}{4} = \cos \frac{\pi}{4} = \frac{\sqrt{2}}{2},$$
$$\tan \frac{\pi}{4} = 1,$$
$$\sin \frac{\pi}{6} = \frac{1}{2},$$
$$\cos \frac{\pi}{6} = \frac{\sqrt{3}}{2}.$$


7. (a) Show that A sin(x + B) can be written as a sin x + b cos x for
suitable a and b. (One of the theorems in this chapter provides a one-line
proof. You should also be able to figure out what a and b are.)

(b) Conversely, given a and b, find numbers A and B such that a sin x
+ b cos x = A sin(x + B) for all x.

(c) Use part (b) to graph $f(x) = \sqrt{3} \sin x + \cos x$.

8. (a) Prove that

$$\tan(x + y) = \frac{\tan x + \tan y}{1 - \tan x \tan y},$$

provided that x, y, and x + y are not of the form $k\pi + \pi/2$. (Use the
addition formulas for sin and cos.)

(b) Prove that

$$\arctan x + \arctan y = \arctan\left(\frac{x + y}{1 - xy}\right),$$

indicating any necessary restrictions on x and y. Hint: Replace x by
arctan x and y by arctan y in part (a).
9. Prove that

$$\arcsin \alpha + \arcsin \beta = \arcsin\left(\alpha\sqrt{1 - \beta^2} + \beta\sqrt{1 - \alpha^2}\right),$$

indicating any restrictions on $\alpha$ and $\beta$.

10. Prove that if m and n are any numbers, then
$$\sin mx \sin nx = \tfrac{1}{2}[\cos (m - n)x - \cos (m + n)x],$$
$$\sin mx \cos nx = \tfrac{1}{2}[\sin (m + n)x + \sin (m - n)x],$$
$$\cos mx \cos nx = \tfrac{1}{2}[\cos (m + n)x + \cos (m - n)x].$$


11. Prove that if m and n are natural numbers, then

$$\int_{-\pi}^{\pi} \sin mx \sin nx \,dx = \begin{cases} 0, & m \ne n \\ \pi, & m = n, \end{cases}$$

$$\int_{-\pi}^{\pi} \cos mx \cos nx \,dx = \begin{cases} 0, & m \ne n \\ \pi, & m = n, \end{cases}$$

$$\int_{-\pi}^{\pi} \sin mx \cos nx \,dx = 0.$$

These relations are particularly important in the theory of Fourier
series. Although this topic will receive serious attention only in the
Suggested Reading, the next problem provides a hint as to their importance.
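12. (a) If f is integrable on $[-\pi, \pi]$, show that the minimum value of

$$\int_{-\pi}^{\pi} (f(x) - a \cos nx)^2 \,dx$$

occurs when

$$a = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \,dx,$$

and the minimum value of

$$\int_{-\pi}^{\pi} (f(x) - a \sin nx)^2 \,dx$$

when

$$a = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \,dx.$$

(In each case, bring a outside the integral sign, obtaining a quadratic
expression in a.)

(b) Define

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \,dx, \qquad n = 0, 1, 2, \ldots,$$
$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \,dx, \qquad n = 1, 2, 3, \ldots.$$

Show that if $c_i$ and $d_i$ are any numbers, then

$$\int_{-\pi}^{\pi} \left( f(x) - \left[ \frac{c_0}{2} + \sum_{n=1}^{N} c_n \cos nx + d_n \sin nx \right] \right)^2 dx$$
$$= \int_{-\pi}^{\pi} [f(x)]^2 \,dx - 2\pi \left( \frac{a_0 c_0}{2} + \sum_{n=1}^{N} a_n c_n + b_n d_n \right) + \pi \left( \frac{c_0^2}{2} + \sum_{n=1}^{N} c_n^2 + d_n^2 \right)$$
$$= \int_{-\pi}^{\pi} [f(x)]^2 \,dx - \pi \left( \frac{a_0^2}{2} + \sum_{n=1}^{N} a_n^2 + b_n^2 \right) + \pi \left( \left( \frac{c_0}{\sqrt{2}} - \frac{a_0}{\sqrt{2}} \right)^2 + \sum_{n=1}^{N} (c_n - a_n)^2 + (d_n - b_n)^2 \right),$$

thus showing that the first integral is smallest when $a_i = c_i$ and
$b_i = d_i$. In other words, among all "linear combinations" of the
functions $s_n(x) = \sin nx$ and $c_n(x) = \cos nx$ for $1 \le n \le N$, the
particular function

$$g(x) = \frac{a_0}{2} + \sum_{n=1}^{N} a_n \cos nx + b_n \sin nx$$

has the "closest fit" to f on $[-\pi, \pi]$.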
13. Find a formula for sin x + sin y and for cos x + cos y. Hint: First find a
formula for sin(a + b) + sin(a − b). What good does that do?
14. (a) Starting from the formula for cos 2x, derive formulas for $\sin^2 x$ and
$\cos^2 x$ in terms of cos 2x.
(b) Prove that

$$\cos \frac{x}{2} = \sqrt{\frac{1 + \cos x}{2}} \quad \text{and} \quad \sin \frac{x}{2} = \sqrt{\frac{1 - \cos x}{2}}$$

for $0 \le x \le \pi/2$.
(c) Use part (a) to find $\int_a^b \sin^2 x \,dx$ and $\int_a^b \cos^2 x \,dx$.


15. Find sin(arctan x) and cos(arctan x) as expressions not involving
trigonometric functions. Hint: y = arctan x means that x = tan y =
sin y/cos y = $\sin y/\sqrt{1 - \sin^2 y}$.

16. If x = tan u/2, express sin u and cos u in terms of x. (Use Problem 15;
the answers should be very simple expressions.)

17. (a) Prove that $\sin(x + \pi/2) = \cos x$. (All along we have been drawing
the graphs of sin and cos as if this were the case.)

(b) What is arcsin(cos x) and arccos(sin x)?


18. (a) Find $\int_0^1 \dfrac{1}{1 + t^2}\,dt$. Hint: The answer is not 45.




FIGURE 24 




(b) Find $\int_0^{\infty} \dfrac{1}{1 + t^2}\,dt$.

19. Find $\lim_{x \to \infty} x \sin \dfrac{1}{x}$.

20. (a) Define functions sin° and cos° by $\sin°(x) = \sin(\pi x/180)$ and
$\cos°(x) = \cos(\pi x/180)$. Find (sin°)' and (cos°)' in terms of these
same functions.

(b) Find $\lim_{x \to 0} \dfrac{\sin° x}{x}$ and $\lim_{x \to \infty} x \sin° \dfrac{1}{x}$.

21. Prove that every point on the unit circle is of the form $(\cos \theta, \sin \theta)$ for
at least one (and hence for infinitely many) numbers $\theta$.

22. (a) Prove that $\pi$ is the maximum possible length of an interval on which
sin is one-one, and that such an interval must be of the form
$(2k\pi - \pi/2, 2k\pi + \pi/2)$.

(b) Suppose we let g(x) = sin x for x in $(2k\pi - \pi/2, 2k\pi + \pi/2)$. What
is $(g^{-1})'$?

23. Let f(x) = sec x for $0 \le x \le \pi$. Find the domain of $f^{-1}$ and sketch its
graph.

24. Prove that |sin x − sin y| < |x − y| for all numbers $x \ne y$. Hint: The
same statement, with < replaced by ≤, is a very straightforward
consequence of a well-known theorem; simple supplementary considerations
then allow ≤ to be improved to <.

*25. It is an excellent test of intuition to predict the value of

$$\lim_{\lambda \to \infty} \int_a^b f(x) \sin \lambda x \,dx.$$

Continuous functions should be most accessible to intuition, but once
you get the right idea for a proof the limit can easily be established for
any integrable f.

(a) Show that $\lim_{\lambda \to \infty} \int_c^d \sin \lambda x \,dx = 0$, by computing the integral
explicitly.

(b) Show that if s is a step function on [a, b] (terminology from Problem
13-17), then $\lim_{\lambda \to \infty} \int_a^b s(x) \sin \lambda x \,dx = 0$.

(c) Finally, use Problem 13-17 to show that $\lim_{\lambda \to \infty} \int_a^b f(x) \sin \lambda x \,dx = 0$
for any function f which is integrable on [a, b].

This result, like Problem 11, plays an important role in the theory of
Fourier series; it is known as the Riemann-Lebesgue Lemma.


26. This problem outlines the classical approach to the trigonometric
functions. The shaded sector in Figure 24 has area x/2.

(a) By considering the triangles OAB and OCB prove that if $0 < x < \pi/4$, then

$$\frac{\sin x}{2} < \frac{x}{2} < \frac{\sin x}{2 \cos x}.$$

(b) Conclude that

$$\cos x < \frac{\sin x}{x} < 1,$$

and prove that

$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$

(c) Use this limit to find

$$\lim_{x \to 0} \frac{1 - \cos x}{x}.$$

(d) Using parts (b) and (c), and the addition formula for sin, find
sin'(x), starting from the definition of the derivative.


*27. Yet another development of the trigonometric functions was briefly
mentioned in the text: starting with inverse functions defined by
integrals. It is convenient to begin with arctan, since this function is defined
for all x. To do this problem, pretend that you have never heard of the
trigonometric functions.

(a) Let $\alpha(x) = \int_0^x (1 + t^2)^{-1}\,dt$. Prove that $\alpha$ is odd and increasing,
and that $\lim_{x \to \infty} \alpha(x)$ and $\lim_{x \to -\infty} \alpha(x)$ both exist, and are negatives of
each other. If we define $\pi = 2 \lim_{x \to \infty} \alpha(x)$, then $\alpha^{-1}$ is defined on
$(-\pi/2, \pi/2)$.

(b) Show that $(\alpha^{-1})'(x) = 1 + [\alpha^{-1}(x)]^2$.

(c) For $x = k\pi + x'$ with $x' \ne \pi/2$ or $-\pi/2$, define $\tan x = \alpha^{-1}(x')$.
Then define $\cos x = 1/\sqrt{1 + \tan^2 x}$, for x not of the form $k\pi + \pi/2$
or $k\pi - \pi/2$, and $\cos(k\pi + \pi/2) = 0$. Prove first that cos'(x) =
− tan x cos x, and then that cos''(x) = − cos x for all x.


28. If we are willing to assume that certain differential equations have
solutions, another approach to the trigonometric functions is possible.
Suppose, in particular, that there is some function $y_0$ which is not always 0
and which satisfies $y_0'' + y_0 = 0$.

(a) Prove that $y_0^2 + (y_0')^2$ is constant, and conclude that either $y_0(0) \ne 0$
or $y_0'(0) \ne 0$.

(b) Prove that there is a function s satisfying s'' + s = 0 and s(0) = 0
and s'(0) = 1. Hint: Try s of the form $a y_0 + b y_0'$.

If we define sin = s and cos = s', then almost all facts about
trigonometric functions become trivial. There is one point which
requires work, however: producing the number $\pi$. This is most
easily done using an exercise from the Appendix to Chapter 11:

(c) Use Problem 6 of the Appendix to Chapter 11 to prove that cos x
cannot be positive for all x > 0. It follows that there is a smallest
$x_0 > 0$ with $\cos x_0 = 0$, and we can define $\pi = 2x_0$.

(d) Prove that $\sin \pi/2 = 1$. (Since $\sin^2 + \cos^2 = 1$, we have $\sin \pi/2 = \pm 1$;
the problem is to decide why $\sin \pi/2$ is positive.)

(e) Find $\cos \pi$, $\sin \pi$, $\cos 2\pi$, and $\sin 2\pi$. (Naturally you may use any
addition formulas, since these can be derived once we know that
sin' = cos and cos' = − sin.)

(f) Prove that cos and sin are periodic with period $2\pi$.


29. (a) After all the work involved in the definition of sin, it would be
disconcerting to find that sin is actually a rational function. Prove that
it isn't. (There is a simple property of sin which a rational function
cannot possibly have.)

(b) Prove that sin isn't even defined implicitly by an algebraic equation;
that is, there do not exist rational functions $f_0, \ldots, f_{n-1}$ such that

$$(\sin x)^n + f_{n-1}(x)(\sin x)^{n-1} + \cdots + f_0(x) = 0 \quad \text{for all } x.$$

Hint: Prove that $f_0 = 0$, so that sin x can be factored out. The
remaining factor is 0 except perhaps at multiples of $2\pi$. But this
implies that it is 0 for all x. (Why?) You are now set up for a proof
by induction.


*30. Suppose that $\phi_1$ and $\phi_2$ satisfy

$$\phi_1'' + g_1\phi_1 = 0,$$
$$\phi_2'' + g_2\phi_2 = 0,$$
and that $g_2 > g_1$.

(a) Show that
$$\phi_1''\phi_2 - \phi_2''\phi_1 - (g_2 - g_1)\phi_1\phi_2 = 0.$$
(b) Show that if $\phi_1(x) > 0$ and $\phi_2(x) > 0$ for all x in (a, b), then

$$\int_a^b [\phi_1''\phi_2 - \phi_2''\phi_1] > 0,$$
and conclude that
$$[\phi_1'(b)\phi_2(b) - \phi_1'(a)\phi_2(a)] - [\phi_1(b)\phi_2'(b) - \phi_1(a)\phi_2'(a)] > 0.$$
(c) Show that in this case we cannot have $\phi_1(a) = \phi_1(b) = 0$. Hint:
Consider the sign of $\phi_1'(a)$ and $\phi_1'(b)$.
(d) Show that the equations $\phi_1(a) = \phi_1(b) = 0$ are also impossible if
$\phi_1 > 0$, $\phi_2 < 0$ or $\phi_1 < 0$, $\phi_2 > 0$, or $\phi_1 < 0$, $\phi_2 < 0$ on (a, b).
(You should be able to do this with almost no extra work.)

The net result of this problem may be stated as follows: if a and b are
consecutive zeros of $\phi_1$, then $\phi_2$ must have a zero somewhere between
a and b. This result, in a slightly more general form, is known as the
Sturm Comparison Theorem. As a particular example, any solution
of the differential equation

$$y'' + (x + 1)y = 0$$

must have zeros on the positive horizontal axis which are within $\pi$ of
each other.


31. Prove that if $\sin x/2 \ne 0$, then

$$\frac{1}{2} + \cos x + \cos 2x + \cdots + \cos nx = \frac{\sin(n + \frac{1}{2})x}{2 \sin \frac{x}{2}}.$$

At this stage of the game, this problem is a mathematical idiot's delight.
A proof by induction depends on establishing the formula

$$\frac{\sin(n + \frac{1}{2})x}{2 \sin \frac{x}{2}} + \cos(n + 1)x = \frac{\sin(n + \frac{3}{2})x}{2 \sin \frac{x}{2}}.$$

This will require repeated use of addition formulas, in just the right
places, but very little thinking. A more reasonable derivation of the
formula will occur in Problem 26-13. Like two other results in this
problem set, this equation is very important in the study of Fourier series,
and we also make use of it in Problems 18-28 and 22-17.


The remaining problems in this chapter form a brief introduction to the 
topic of arc length, and show how trigonometric functions can be defined in 
terms of arc length. 


*32. Let f be a continuous function on [a, b]. If $P = \{t_0, \ldots, t_n\}$ is a
partition of [a, b], define

$$\ell(f, P) = \sum_{i=1}^{n} \sqrt{(t_i - t_{i-1})^2 + [f(t_i) - f(t_{i-1})]^2}.$$

The number $\ell(f, P)$ represents the length of a polygonal curve inscribed
in the graph of f (see Figure 25). We define the length of f on [a, b] to
be the least upper bound of all $\ell(f, P)$ for all partitions P (provided that
the set of all such $\ell(f, P)$ is bounded above).

(a) If f is a linear function on [a, b], prove that the length of f is the
distance from (a, f(a)) to (b, f(b)).

(b) If f is not linear, prove that there is a partition P = {a, t, b} of
[a, b] such that $\ell(f, P)$ is greater than the distance from (a, f(a)) to
(b, f(b)). (You will need Problem 4-8.)


FIGURE 25

(c) Conclude that of all functions f on [a, b] with f(a) = c and f(b) = d,
the length of the linear function is less than the length of any other.
(Or, in conventional but hopelessly muddled terminology: "A
straight line is the shortest distance between two points.")

*33. (a) Suppose that f' is bounded on [a, b]. If P is any partition of [a, b]
show that

$$L(\sqrt{1 + (f')^2}, P) \le \ell(f, P) \le U(\sqrt{1 + (f')^2}, P).$$
Hint: Use the Mean Value Theorem.
(b) Why is $\sup \{L(\sqrt{1 + (f')^2}, P)\} \le \sup \{\ell(f, P)\}$? (This is easy.)
(c) Now show that $\sup \{\ell(f, P)\} \le \inf \{U(\sqrt{1 + (f')^2}, P)\}$, thereby
proving that the length of f on [a, b] is $\int_a^b \sqrt{1 + (f')^2}$, if $\sqrt{1 + (f')^2}$
is integrable on [a, b]. Hint: It suffices to show that if P' and P'' are
any two partitions, then $\ell(f, P') \le U(\sqrt{1 + (f')^2}, P'')$. If P
contains the points of both P' and P'', how does $\ell(f, P')$ compare to
$\ell(f, P)$?

*34. Let $f(x) = \sqrt{1 - x^2}$ for $-1 \le x \le 1$. Define $\mathcal{L}(x)$ to be the length of
f on [x, 1].
(a) Show that

$$\mathcal{L}(x) = \int_x^1 \frac{1}{\sqrt{1 - t^2}}\,dt.$$

(This is actually an improper integral, as defined in Problem 14-17.)
(b) Show that

$$\mathcal{L}'(x) = -\frac{1}{\sqrt{1 - x^2}} \quad \text{for } -1 < x < 1.$$
(c) Define $\pi$ as $\mathcal{L}(-1)$. For $0 \le x \le \pi$, define cos x by $\mathcal{L}(\cos x) = x$,
and define $\sin x = \sqrt{1 - \cos^2 x}$. Prove that cos'(x) = −sin x and
sin'(x) = cos x for $0 < x < \pi$.


*CHAPTER 16


π IS IRRATIONAL


This short chapter, diverging from the main stream of the book, is included to
demonstrate that we are already in a position to do some sophisticated
mathematics. This entire chapter is devoted to an elementary proof that $\pi$ is
irrational. Like many "elementary" proofs of deep theorems, the motivation for
many steps in our proof cannot be supplied; nevertheless, it is still quite
possible to follow the proof step-by-step.


Two observations must be made before the proof. The first concerns the
function

$$f_n(x) = \frac{x^n(1 - x)^n}{n!},$$

which clearly satisfies

$$0 < f_n(x) < \frac{1}{n!} \quad \text{for } 0 < x < 1.$$

An important property of the function $f_n$ is revealed by considering the
expression obtained by actually multiplying out $x^n(1 - x)^n$. The lowest power of x
appearing will be n and the highest power will be 2n. Thus $f_n$ can be written
in the form

$$f_n(x) = \frac{1}{n!} \sum_{i=n}^{2n} c_i x^i,$$

where the numbers $c_i$ are integers. It is clear from this expression that

$$f_n^{(k)}(0) = 0 \quad \text{if } k < n \text{ or } k > 2n.$$

Moreover,

$$f_n^{(n)}(x) = \frac{1}{n!} [n!\, c_n + \text{terms involving } x],$$
$$f_n^{(n+1)}(x) = \frac{1}{n!} [(n + 1)!\, c_{n+1} + \text{terms involving } x],$$
$$\vdots$$
$$f_n^{(2n)}(x) = \frac{1}{n!} [(2n)!\, c_{2n}].$$



This means that

$$f_n^{(n)}(0) = c_n,$$
$$f_n^{(n+1)}(0) = (n + 1) c_{n+1},$$
$$\vdots$$
$$f_n^{(2n)}(0) = (2n)(2n - 1) \cdots (n + 1) c_{2n},$$
where the numbers on the right are all integers. Thus

$$f_n^{(k)}(0) \text{ is an integer for all } k.$$
The relation

$$f_n(x) = f_n(1 - x)$$

implies that

$$f_n^{(k)}(x) = (-1)^k f_n^{(k)}(1 - x);$$

therefore,
$$f_n^{(k)}(1) \text{ is also an integer for all } k.$$
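The integrality of the derivatives of $f_n$ at 0 and at 1 can be checked exactly for small n. The sketch below expands $x^n(1 - x)^n/n!$ with rational coefficients and differentiates the polynomial term by term; the helper names `f_n_coefficients` and `derivative_at` are made up for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

def f_n_coefficients(n):
    """Coefficients of f_n(x) = x^n (1 - x)^n / n!, indexed by the power of x."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        # (1 - x)^n = sum over j of C(n, j) (-1)^j x^j, then shift by x^n.
        coeffs[n + j] = Fraction((-1) ** j * comb(n, j), factorial(n))
    return coeffs

def derivative_at(coeffs, k, point):
    """Exact value of the k-th derivative of the polynomial at the given point."""
    point = Fraction(point)
    total = Fraction(0)
    for power, c in enumerate(coeffs):
        if power >= k:
            falling = factorial(power) // factorial(power - k)
            total += c * falling * point ** (power - k)
    return total

n = 3
coeffs = f_n_coefficients(n)
for k in range(2 * n + 2):
    print(k, derivative_at(coeffs, k, 0), derivative_at(coeffs, k, 1))
```

Every printed value should be an integer (a fraction with denominator 1), and the values at 0 vanish for k < n and k > 2n, in line with the observations above.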


The proof that $\pi$ is irrational requires one further observation: if a is any
number, and $\varepsilon > 0$, then for sufficiently large n we will have

$$\frac{a^n}{n!} < \varepsilon.$$

To prove this, notice that if $n \ge 2a$, then

$$\frac{a^{n+1}}{(n + 1)!} = \frac{a}{n + 1} \cdot \frac{a^n}{n!} < \frac{1}{2} \cdot \frac{a^n}{n!}.$$

Now let $n_0$ be any natural number with $n_0 \ge 2a$. Then, whatever value

$$\frac{a^{n_0}}{(n_0)!}$$
may have, the succeeding values satisfy
$$\frac{a^{(n_0 + 1)}}{(n_0 + 1)!} < \frac{1}{2} \cdot \frac{a^{n_0}}{(n_0)!},$$
$$\frac{a^{(n_0 + 2)}}{(n_0 + 2)!} < \frac{1}{2} \cdot \frac{a^{(n_0 + 1)}}{(n_0 + 1)!} < \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{a^{n_0}}{(n_0)!},$$
$$\vdots$$
$$\frac{a^{(n_0 + k)}}{(n_0 + k)!} < \frac{1}{2^k} \cdot \frac{a^{n_0}}{(n_0)!}.$$
If k is so large that $\dfrac{a^{n_0}}{(n_0)!\,\varepsilon} < 2^k$, then

$$\frac{a^{(n_0 + k)}}{(n_0 + k)!} < \varepsilon,$$

which is the desired result. Having made these observations, we are ready
for the one theorem in this chapter.

THEOREM 1 The number $\pi$ is irrational; in fact, $\pi^2$ is irrational. (Notice that the
irrationality of $\pi^2$ implies the irrationality of $\pi$, for if $\pi$ were rational, then $\pi^2$ certainly
would be.)

PROOF Suppose $\pi^2$ were rational, so that

$$\pi^2 = \frac{a}{b}$$

for some positive integers a and b. Let


(1) $G(x) = b^n [\pi^{2n} f_n(x) - \pi^{2n-2} f_n''(x) + \pi^{2n-4} f_n^{(4)}(x) - \cdots + (-1)^n f_n^{(2n)}(x)].$

Notice that each of the factors

$$b^n \pi^{2n-2k} = b^n (\pi^2)^{n-k} = b^n \left(\frac{a}{b}\right)^{n-k} = a^{n-k} b^k$$

is an integer. Since $f_n^{(k)}(0)$ and $f_n^{(k)}(1)$ are integers, this shows that
G(0) and G(1) are integers.
Differentiating G twice yields

(2) $G''(x) = b^n [\pi^{2n} f_n''(x) - \pi^{2n-2} f_n^{(4)}(x) + \cdots + (-1)^n f_n^{(2n+2)}(x)].$

The last term, $(-1)^n f_n^{(2n+2)}(x)$, is zero. Thus, adding (1) and (2) gives

(3) $G''(x) + \pi^2 G(x) = b^n \pi^{2n+2} f_n(x) = \pi^2 a^n f_n(x).$

Now let
$$H(x) = G'(x) \sin \pi x - \pi G(x) \cos \pi x.$$
Then
$$H'(x) = \pi G'(x) \cos \pi x + G''(x) \sin \pi x - \pi G'(x) \cos \pi x + \pi^2 G(x) \sin \pi x$$
$$= [G''(x) + \pi^2 G(x)] \sin \pi x$$
$$= \pi^2 a^n f_n(x) \sin \pi x, \quad \text{by (3)}.$$

By the Second Fundamental Theorem of Calculus,

$$\pi^2 \int_0^1 a^n f_n(x) \sin \pi x \,dx = H(1) - H(0)$$
$$= G'(1) \sin \pi - \pi G(1) \cos \pi - G'(0) \sin 0 + \pi G(0) \cos 0$$
$$= \pi [G(1) + G(0)].$$

Thus
$$\pi \int_0^1 a^n f_n(x) \sin \pi x \,dx \text{ is an integer.}$$




On the other hand, $0 < f_n(x) < 1/n!$ for 0 < x < 1, so

$$0 < \pi a^n f_n(x) \sin \pi x < \frac{\pi a^n}{n!} \quad \text{for } 0 < x < 1.$$
Consequently,

$$0 < \pi \int_0^1 a^n f_n(x) \sin \pi x \,dx < \frac{\pi a^n}{n!}.$$

This reasoning was completely independent of the value of n. Now if n is large
enough, then

$$0 < \pi \int_0^1 a^n f_n(x) \sin \pi x \,dx < \frac{\pi a^n}{n!} < 1.$$

But this is absurd, because the integral is an integer, and there is no integer
between 0 and 1. Thus our original assumption must have been incorrect:
$\pi^2$ is irrational. ∎
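The last step relies on $\pi a^n/n!$ eventually dropping below 1, whatever the positive integer a is. The following sketch simply searches for the first such n; the function name `first_n_below_one` is an assumption of this example, and floating-point arithmetic is used only as an illustration of how quickly the factorial wins.

```python
import math

def first_n_below_one(a):
    """First natural number n with pi * a**n / n! < 1, for a positive integer a."""
    n = 1
    term = math.pi * a  # value of pi * a^1 / 1!
    while term >= 1:
        n += 1
        term *= a / n  # turns pi * a^(n-1) / (n-1)! into pi * a^n / n!
    return n

for a in (1, 5, 10, 22):
    n = first_n_below_one(a)
    print(a, n, math.pi * a ** n / math.factorial(n))
```

Once the terms have dropped below 1 in this search they keep decreasing, since by then n exceeds a and each further step multiplies by a factor smaller than 1.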


This proof is admittedly mysterious; perhaps most mysterious of all is the
way that $\pi$ enters the proof. It almost looks as if we have proved $\pi$ irrational
without ever mentioning a definition of $\pi$. A close reexamination of the proof
will show that precisely one property of $\pi$ is essential:

$$\sin(\pi) = 0.$$

The proof really depends on the properties of the function sin, and proves the
irrationality of the smallest positive number x with sin x = 0. In fact, very few
properties of sin are required, namely,

sin' = cos,
cos' = − sin,
sin(0) = 0,
cos(0) = 1.

Even this list could be shortened; as far as the proof is concerned, cos might
just as well be defined as sin'. The properties of sin required in the proof may
then be written

sin'' + sin = 0,
sin(0) = 0,
sin'(0) = 1.

Of course, this is not really very surprising at all, since, as we have seen in the 
previous chapter, these properties characterize the function sin completely. 


PROBLEMS 


1. (a) Prove that the areas of triangles OAB and OAC in Figure 1 are related
by the equation

$$\text{area } OAC = \frac{1}{2} \sqrt{\frac{1 - \sqrt{1 - 16(\text{area } OAB)^2}}{2}}.$$

Hint: Solve the equations xy = 2(area OAB), $x^2 + y^2 = 1$, for y.

FIGURE 1

(b) Let $P_m$ be the regular polygon of m sides inscribed in the unit circle.
If $A_m$ is the area of $P_m$, show that

$$A_{2m} = \frac{m}{2} \sqrt{2 - 2\sqrt{1 - (2A_m/m)^2}}.$$

This result allows one to obtain (more and more complicated)
expressions for $A_{2^n}$, starting with $A_4 = 2$, and thus to compute $\pi$ as
accurately as desired (according to Problem 8-11). Although better
methods will appear in Chapter 19, a slight variant of this approach yields
a very interesting expression for $\pi$:

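2. (a) Using the fact that

$$\frac{\text{area}(OAB)}{\text{area}(OAC)} = OB,$$

show that if $\alpha_m$ is the distance from O to one side of $P_m$, then

$$\frac{A_m}{A_{2m}} = \alpha_m.$$

(b) Show that

$$\frac{2}{A_{2^k}} = \alpha_4 \cdot \alpha_8 \cdot \ldots \cdot \alpha_{2^{k-1}}.$$

(c) Using the fact that

$$\alpha_m = \cos \frac{\pi}{m}$$

and the formula $\cos x/2 = \sqrt{\dfrac{1 + \cos x}{2}}$ (Problem 15-14), prove that

$$\alpha_4 = \sqrt{\frac{1}{2}},$$
$$\alpha_8 = \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}},$$
$$\alpha_{16} = \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}}.$$

Together with part (b), this shows that $2/\pi$ can be written as an
"infinite product"

$$\frac{2}{\pi} = \sqrt{\frac{1}{2}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdots;$$

to be precise, this equation means that the product of the first n
factors can be made as close to $2/\pi$ as desired, by choosing n sufficiently
large. This product was discovered by François Viète in 1579, and is
only one of many fascinating expressions for $\pi$, some of which are
mentioned later.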
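The convergence of this infinite product is easy to watch numerically. In the sketch below each factor is obtained from the previous one by the half-angle formula; the function name `viete_partial_product` is an assumption of this example, and the comparison value uses the library constant `math.pi`.

```python
import math

def viete_partial_product(factors):
    """Product of the first `factors` terms sqrt(1/2), sqrt(1/2 + (1/2)sqrt(1/2)), ..."""
    term = 0.0  # plays the role of cos(pi/2) = 0 before the first halving
    product = 1.0
    for _ in range(factors):
        term = math.sqrt(0.5 + 0.5 * term)  # cos(x/2) = sqrt((1 + cos x)/2)
        product *= term
    return product

for n in (1, 2, 5, 10, 25):
    print(n, viete_partial_product(n), 2 / math.pi)
```

Each additional factor reduces the error by roughly a factor of four, so a couple of dozen factors already reach the limits of floating-point precision.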


CHAPTER 17


THE LOGARITHM AND
EXPONENTIAL FUNCTIONS


In Chapter 15 the integral provided a rigorous formulation for a preliminary
definition of the functions sin and cos. In this chapter the integral plays a more
essential role. For certain functions even a preliminary definition presents
difficulties. For example, consider the function

$$f(x) = 10^x.$$

This function is assumed to be defined for all x and to have an inverse function,
defined for positive x, which is the "logarithm to the base 10,"

$$f^{-1}(x) = \log_{10} x.$$
In algebra, $10^x$ is usually defined only for rational x, while the definition for
irrational x is quietly ignored. A brief review of the definition for rational x
will not only explain this omission, but also recall an important principle
behind the definition of $10^x$.

The symbol $10^n$ is first defined for natural numbers n. This notation turns
out to be extremely convenient, especially for multiplying very large numbers,
because

$$10^n \cdot 10^m = 10^{n+m}.$$
The extension of the definition of $10^x$ to rational x is motivated by the desire to
preserve this equation; this requirement actually forces upon us the customary
definition. Since we want the equation

$$10^0 \cdot 10^n = 10^{0+n} = 10^n$$
to be true, we must define $10^0 = 1$; since we want the equation
$$10^{-n} \cdot 10^n = 10^0 = 1$$

to be true, we must define $10^{-n} = 1/10^n$; since we want the equation

$$\underbrace{10^{1/n} \cdot \ldots \cdot 10^{1/n}}_{n \text{ times}} = 10^{\overbrace{1/n + \cdots + 1/n}^{n \text{ times}}} = 10^1 = 10$$

to be true, we must define $10^{1/n} = \sqrt[n]{10}$; and since we want the equation

$$\underbrace{10^{1/n} \cdot \ldots \cdot 10^{1/n}}_{m \text{ times}} = 10^{\overbrace{1/n + \cdots + 1/n}^{m \text{ times}}} = 10^{m/n}$$

to be true, we must define $10^{m/n} = (\sqrt[n]{10})^m$.

Unfortunately, at this point the program comes to a dead halt. We have
been guided by the principle that $10^x$ should be defined so as to ensure that
$10^{x+y} = 10^x 10^y$; but this principle does not suggest any simple algebraic way
of defining $10^x$ for irrational x. For this reason we will try some more
sophisticated ways of finding a function f such that

(*) $f(x + y) = f(x) \cdot f(y)$ for all x and y.

Of course, we are interested in a function which is not always zero, so we might
add the condition $f(1) \ne 0$. If we add the more specific condition f(1) = 10,
then (*) will imply that $f(x) = 10^x$ for rational x, and $10^x$ could be defined as
f(x) for other x; in general f(x) will equal $[f(1)]^x$ for rational x.

One way to find such a function is suggested if we try to solve an apparently
more difficult problem: find a differentiable function f such that

$$f(x + y) = f(x) \cdot f(y) \quad \text{for all } x \text{ and } y,$$
$$f(1) = 10.$$

Assuming that such a function exists, we can try to find f'; knowing the
derivative of f might provide a clue to the definition of f itself. Now

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
$$= \lim_{h \to 0} \frac{f(x) \cdot f(h) - f(x)}{h}$$
$$= f(x) \cdot \lim_{h \to 0} \frac{f(h) - 1}{h}.$$
The answer thus depends on
$$f'(0) = \lim_{h \to 0} \frac{f(h) - 1}{h};$$

for the moment assume this limit exists, and denote it by $\alpha$. Then
$$f'(x) = \alpha \cdot f(x) \quad \text{for all } x.$$

Even if $\alpha$ could be computed, this approach seems self-defeating. The
derivative of f has been expressed in terms of f again.
If we examine the inverse function $f^{-1} = \log_{10}$, the whole situation appears
in a new light:

$$\log_{10}'(x) = (f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{\alpha f(f^{-1}(x))} = \frac{1}{\alpha x}.$$

The derivative of $f^{-1}$ is about as simple as one could ask! And, what is even
more interesting, of all the integrals $\int_a^b x^n \,dx$ examined previously, the integral
$\int_a^b x^{-1} \,dx$ is the only one which we cannot evaluate. Since $\log_{10} 1 = 0$ we
should have

$$\frac{1}{\alpha} \int_1^x \frac{1}{t}\,dt = \log_{10} x - \log_{10} 1 = \log_{10} x.$$


This suggests that we define $\log_{10} x$ as $(1/\alpha) \int_1^x t^{-1}\,dt$. The difficulty is that
$\alpha$ is unknown. One way of evading this difficulty is to define

$$\log x = \int_1^x \frac{1}{t}\,dt,$$

and hope that this integral will be the logarithm to some base, which might be
determined later. In any case, the function defined in this way is surely more
reasonable, from a mathematical point of view, than $\log_{10}$. The usefulness
of $\log_{10}$ depends on the important role of the number 10 in arabic notation
(and thus ultimately on the fact that we have ten fingers), while the function
log provides a notation for an extremely simple integral which cannot be
evaluated in terms of any functions already known to us.

DEFINITION If x > 0, then

$$\log x = \int_1^x \frac{1}{t}\,dt.$$
1 ¢ 


The graph of log is shown in Figure 1. Notice that if x > 1, then log x > 0,
and if 0 < x < 1, then log x < 0, since, by our conventions,

$$\int_1^x \frac{1}{t}\,dt = -\int_x^1 \frac{1}{t}\,dt < 0.$$

For $x \le 0$, a number log x cannot be defined in this way, because f(t) = 1/t is
not bounded on [x, 1].


FIGURE 1 (panels (a) and (b))


The justification for the notation "log" comes from the following theorem.

THEOREM 1 If x, y > 0, then
$$\log(xy) = \log x + \log y.$$

PROOF Notice first that log'(x) = 1/x, by the Fundamental Theorem of Calculus.
Now choose a number y > 0 and let

$$f(x) = \log(xy).$$

Then

$$f'(x) = \log'(xy) \cdot y = \frac{1}{xy} \cdot y = \frac{1}{x}.$$

Thus f' = log'. This means that there is a number c such that

$$f(x) = \log x + c \quad \text{for all } x > 0,$$
that is,
$$\log(xy) = \log x + c \quad \text{for all } x > 0.$$

The number c can be evaluated by noting that when x = 1 we obtain

$$\log(1 \cdot y) = \log 1 + c = c.$$
Thus
$$\log(xy) = \log x + \log y \quad \text{for all } x.$$

Since this is true for all y > 0, the theorem is proved. ∎

COROLLARY 1 If n is a natural number and x > 0, then

$$\log(x^n) = n \log x.$$

PROOF Left to you (use induction). ∎

COROLLARY 2 If x, y > 0, then
$$\log\left(\frac{x}{y}\right) = \log x - \log y.$$

PROOF This follows from the equations

$$\log x = \log\left(\frac{x}{y} \cdot y\right) = \log\left(\frac{x}{y}\right) + \log y.$$ ∎
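Theorem 1 and its corollaries can be observed directly from the integral definition, without appealing to any built-in logarithm. The sketch below approximates $\int_1^x (1/t)\,dt$ by the midpoint rule; the name `log_by_integral` and the number of subintervals are assumptions of this illustration, and the identities hold only up to the error of the approximation.

```python
def log_by_integral(x, pieces=100000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to x, for x > 0."""
    h = (x - 1) / pieces  # negative when 0 < x < 1, so the result is negative there
    total = 0.0
    for i in range(pieces):
        t = 1 + (i + 0.5) * h
        total += 1 / t
    return total * h

# Theorem 1: log(xy) = log x + log y
print(log_by_integral(6), log_by_integral(2) + log_by_integral(3))
# Corollary 1: log(x^n) = n log x
print(log_by_integral(8), 3 * log_by_integral(2))
# Corollary 2: log(x/y) = log x - log y
print(log_by_integral(0.25), log_by_integral(1) - log_by_integral(4))
```

In each printed pair the two numbers should agree to many decimal places.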


Theorem 1 provides some important information about the graph of log.
The function log is clearly increasing, but since log'(x) = 1/x, the derivative
becomes very small as x becomes large, and log consequently grows more and
more slowly. It is not immediately clear whether log is bounded or unbounded
on R. Observe, however, that for a natural number n,

$$\log(2^n) = n \log 2 \quad (\text{and } \log 2 > 0);$$
it follows that log is, in fact, not bounded above. Similarly,
$$\log\left(\frac{1}{2^n}\right) = \log 1 - \log 2^n = -n \log 2;$$

therefore log is not bounded below on (0, 1). Since log is continuous, it actually
takes on all values. Therefore R is the domain of the function $\log^{-1}$. This
important function has a special name, whose appropriateness will soon
become clear.


DEFINITION

The "exponential function," exp, is defined as log^{-1}.


The graph of exp is shown in Figure 2. Since log x is defined only for x > 0,
we always have exp(x) > 0. The derivative of the function exp is easy to
determine.


FIGURE 2


THEOREM 2

For all numbers x,

\[ \exp'(x) = \exp(x). \]

PROOF

\[ \exp'(x) = (\log^{-1})'(x) = \frac{1}{\log'(\log^{-1}(x))} = \frac{1}{\dfrac{1}{\log^{-1}(x)}} = \log^{-1}(x) = \exp(x). \] ∎


A second important property of exp is an easy consequence of Theorem 1.


THEOREM 3

If x and y are any two numbers, then

exp(x + y) = exp(x) · exp(y).

PROOF

Let x' = exp(x) and y' = exp(y), so that

x = log x',
y = log y'.

Then

x + y = log x' + log y' = log(x'y').

This means that

exp(x + y) = x'y' = exp(x) · exp(y). ∎


This theorem, and the discussion at the beginning of this chapter, suggest
that exp(1) is particularly important. There is, in fact, a special symbol for
this number.


DEFINITION

e = exp(1).


This definition is equivalent to the equation

\[ 1 = \log e = \int_1^e \frac{1}{t}\,dt. \]

As illustrated in Figure 3,

\[ \int_1^2 \frac{1}{t}\,dt < 1, \]

since 1 · (2 − 1) is an upper sum for f(t) = 1/t on [1, 2], and

\[ \int_1^4 \frac{1}{t}\,dt > 1, \]

since 1/2 · (2 − 1) + 1/4 · (4 − 2) = 1 is a lower sum for f(t) = 1/t on [1, 4].


FIGURE 3


Thus

\[ \int_1^2 \frac{1}{t}\,dt < \int_1^e \frac{1}{t}\,dt < \int_1^4 \frac{1}{t}\,dt, \]

which shows that

2 < e < 4.
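The two sums quoted in this argument are easy to recompute. The sketch below is only an illustration; the variable names are made up for the purpose, and `math.log` and `math.e` are used merely as an independent check.

```python
import math

def f(t):
    return 1.0 / t

# f is decreasing, so on each subinterval its sup is attained at the left
# endpoint and its inf at the right endpoint.
upper_on_1_2 = f(1) * (2 - 1)                    # upper sum for the partition {1, 2}
lower_on_1_4 = f(2) * (2 - 1) + f(4) * (4 - 2)   # lower sum for the partition {1, 2, 4}
print(upper_on_1_2, lower_on_1_4)

# Consequently log 2 < 1 < log 4, which is the statement 2 < e < 4.
print(math.log(2) < 1 < math.log(4), 2 < math.e < 4)
```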


In Chapter 19 we will find much better approximations for e, and also prove
that e is irrational (the proof is much easier than the proof that π is irrational!).
As we remarked at the beginning of the chapter, the equation

exp(x + y) = exp(x) · exp(y)

implies that

exp(x) = [exp(1)]^x
       = e^x, for all rational x.

Since exp is defined for all x and exp(x) = e^x for rational x, it is consistent with
our earlier use of the exponential notation to define e^x as exp(x) for all x.


DEFINITION For any number x, 


e* = exp(x). 


The terminology "exponential function" should now be clear. We have
succeeded in defining e^x for an arbitrary (even irrational) exponent x. We have
not yet defined a^x, if a ≠ e, but there is a reasonable principle to guide us in
the attempt. If x is rational, then

\[ a^x = (e^{\log a})^x = e^{x \log a}. \]

But the last expression is defined for all x, so we can use it to define a^x.


DEFINITION If a > 0, then, for any real number x,

\[ a^x = e^{x \log a}. \]

(If a = e this definition clearly agrees with the previous one.)


The requirement a > 0 is necessary, in order that log a be defined. This is
not unduly restrictive since, for example, we would not even expect

\[ (-1)^{1/2} = \sqrt{-1} \]

to be defined. (Of course, for certain rational x, the symbol a^x will make sense,
according to the old definition; for example,

\[ (-1)^{1/3} = \sqrt[3]{-1} = -1.) \]

Our definition of a^x was designed to ensure that

(e^x)^y = e^{xy} for all x and y.

As we would hope, this equation turns out to be true when e is replaced by any
number a > 0. The proof is a moderately involved unraveling of terminology.
At the same time we will prove the other important properties of a^x.


THEOREM 4

If a > 0, then

(1) (a^b)^c = a^{bc} for all b, c.

(Notice that a^b will automatically be positive, so (a^b)^c will be defined);

(2) a^1 = a and a^{x+y} = a^x · a^y for all x, y.

(Notice that (2) implies that this definition of a^x agrees with the old one for
rational x.)

PROOF

\[ (1)\quad (a^b)^c = e^{c \log a^b} = e^{c \log(e^{b \log a})} = e^{c(b \log a)} = e^{cb \log a} = a^{bc}. \]

(Each of the steps in this string of equalities depends upon our last definition,
or the fact that exp = log^{-1}.)

\[ (2)\quad a^1 = e^{1 \cdot \log a} = e^{\log a} = a, \]

\[ a^{x+y} = e^{(x+y)\log a} = e^{x \log a + y \log a} = e^{x \log a} \cdot e^{y \log a} = a^x \cdot a^y. \] ∎


FIGURE 4


FIGURE 5   f(x) = log_a x (a < 1)
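The definition a^x = e^{x log a} translates directly into code, and the two parts of Theorem 4 can then be spot-checked in floating point. This is a sketch under assumptions: the function name `power` and the sample values of a, b, c are chosen here for illustration only.

```python
import math

def power(a, x):
    """a^x defined as exp(x * log a); the definition requires a > 0."""
    if a <= 0:
        raise ValueError("a must be positive so that log a is defined")
    return math.exp(x * math.log(a))

a, b, c = 2.5, math.sqrt(2), -1.3            # irrational exponents are allowed
print(power(power(a, b), c), power(a, b * c))        # part (1): (a^b)^c = a^(bc)
print(power(a, b + c), power(a, b) * power(a, c))    # part (2): a^(x+y) = a^x * a^y
print(power(a, 1), a)                                # part (2): a^1 = a
```

The pairs agree up to rounding error, as the theorem predicts.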


Figure 4 shows the graphs of f(x) = a^x for several different a. The behavior
of the function depends on whether a < 1, a = 1, or a > 1. If a = 1, then
f(x) = 1^x = 1. Suppose a > 1. In this case log a > 0. Thus,

if x < y,
then x log a < y log a,
so e^{x log a} < e^{y log a},
i.e., a^x < a^y.

Thus the function f(x) = a^x is increasing. On the other hand, if 0 < a < 1, so
that log a < 0, the same sort of reasoning shows that the function f(x) = a^x is
decreasing. In either case, if a > 0 and a ≠ 1, then f(x) = a^x is one-one. Since
exp takes on every positive value it is also easy to see that a^x takes on every
positive value. Thus the inverse function is defined for all positive numbers,
and takes on all values. If f(x) = a^x, then f^{-1} is the function usually denoted
by log_a (Figure 5).




Just as a^x can be expressed in terms of exp, so log_a can be expressed in terms
of log. Indeed,

if y = log_a x,
then x = a^y = e^{y log a},
so log x = y log a,
or y = (log x)/(log a).

In other words,

\[ \log_a x = \frac{\log x}{\log a}. \]


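The derivatives of f(x) = a^x and g(x) = log_a x are both easy to find:

\[ f(x) = e^{x \log a}, \quad \text{so} \quad f'(x) = \log a \cdot e^{x \log a} = \log a \cdot a^x, \]

\[ g(x) = \frac{\log x}{\log a}, \quad \text{so} \quad g'(x) = \frac{1}{x \log a}. \]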
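These two derivative formulas can be compared with difference quotients. The sketch below assumes a central-difference approximation with a hand-picked step; the names `log_base` and `central_difference` and the sample values are illustrative only.

```python
import math

def log_base(a, x):
    """log_a x expressed through the natural logarithm."""
    return math.log(x) / math.log(a)

def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient, a numerical stand-in for func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

a, x = 3.0, 1.7
num_f = central_difference(lambda t: math.exp(t * math.log(a)), x)
num_g = central_difference(lambda t: log_base(a, t), x)

print(num_f, math.log(a) * a ** x)      # derivative of a^x is (log a) * a^x
print(num_g, 1 / (x * math.log(a)))     # derivative of log_a x is 1 / (x log a)
```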


A more complicated function like

\[ f(x) = g(x)^{h(x)} \]

is also easy to differentiate, if you remember that, by definition,

\[ f(x) = e^{h(x) \log g(x)}; \]

it follows from the Chain Rule that

\[ f'(x) = e^{h(x) \log g(x)} \cdot \left[ h'(x) \log g(x) + h(x) \frac{g'(x)}{g(x)} \right]
        = g(x)^{h(x)} \cdot \left[ h'(x) \log g(x) + h(x) \frac{g'(x)}{g(x)} \right]. \]

There is no point in remembering this formula: simply apply the principle
behind it in any specific case that arises. It does help, however, to remember
that the first factor in the derivative will be g(x)^{h(x)}.
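The principle behind this formula can be written out once and tested against a difference quotient. What follows is a sketch, not a recommended way of differentiating: the helper names and the sample functions g(x) = 1 + x^2, h(x) = sin x are assumptions made for the example.

```python
import math

def derivative_of_power(g, dg, h, dh, x):
    """Derivative of f(x) = g(x)**h(x), obtained from f(x) = exp(h(x) * log g(x)).

    Assumes g(x) > 0, so that log g(x) is defined.
    """
    gx, hx = g(x), h(x)
    return gx ** hx * (dh(x) * math.log(gx) + hx * dg(x) / gx)

def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2 * step)

# Illustrative choice (not from the text): g(x) = 1 + x^2, h(x) = sin x.
g, dg = (lambda x: 1 + x * x), (lambda x: 2 * x)
h, dh = math.sin, math.cos

x0 = 0.8
exact = derivative_of_power(g, dg, h, dh, x0)
approx = central_difference(lambda x: g(x) ** h(x), x0)
print(exact, approx)

# Special case g(x) = h(x) = x: the derivative of x^x is x^x * (log x + 1).
xx = derivative_of_power(lambda x: x, lambda x: 1.0, lambda x: x, lambda x: 1.0, 2.0)
print(xx, 4 * (math.log(2) + 1))
```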

There is one special case of the above formula which is worth remembering.
The function f(x) = x^a was previously defined only for rational a. We can now
define and find the derivative of the function f(x) = x^a for any number a; the
result is just what we would expect:

\[ f(x) = x^a = e^{a \log x}, \]

so

\[ f'(x) = \frac{a}{x} \cdot e^{a \log x} = \frac{a}{x} \cdot x^a = a x^{a-1}. \]
x x 


Algebraic manipulations with the exponential functions will become second 
nature after a little practice—just remember that all the rules which ought to 
work actually do. The basic properties of exp are still those stated in Theorems 
2 and 3: 


exp'(x) = exp(x),
exp(x + y) = exp(x) · exp(y).

In fact, each of these properties comes close to characterizing the function exp.
Naturally, exp is not the only function f satisfying f' = f, for if f(x) = ce^x, then
f'(x) = ce^x = f(x); these functions are the only ones with this property,
however.


THEOREM 5

If f is differentiable and

f'(x) = f(x) for all x,

then there is a number c such that

f(x) = ce^x for all x.

PROOF

Let

\[ g(x) = \frac{f(x)}{e^x}. \]

(This is permissible, since e^x ≠ 0 for all x.) Then

\[ g'(x) = \frac{e^x f'(x) - f(x) e^x}{(e^x)^2} = 0. \]

Therefore there is a number c such that

\[ g(x) = \frac{f(x)}{e^x} = c \quad \text{for all } x. \] ∎


The second basic property of exp requires a more involved discussion. The
function exp is clearly not the only function f which satisfies

f(x + y) = f(x) · f(y).

In fact, f(x) = 0 or any function of the form f(x) = a^x also satisfies this equation.
But the true story is much more complex than this: there are infinitely
many other functions which satisfy this property, but it is impossible, without
appealing to more advanced mathematics, to prove that there is even one
function other than those already mentioned! It is for this reason that the
definition of 10^x is so difficult: there are infinitely many functions f which
satisfy

f(x + y) = f(x) · f(y),
f(1) = 10,

but which are not the function f(x) = 10^x! One thing is true however: any
continuous function f satisfying

f(x + y) = f(x) · f(y)
must be of the form f(x) = a^x or f(x) = 0. (Problem 27 indicates the way to
prove this, and also has a few words to say about discontinuous functions with
this property.)

In addition to the two basic properties stated in Theorems 2 and 3, the
function exp has one further property which is very important: exp "grows
faster than any polynomial." In other words,


THEOREM 6

For any natural number n,

\[ \lim_{x \to \infty} \frac{e^x}{x^n} = \infty. \]

PROOF

The proof consists of several steps.

Step 1. e^x > x for all x, and consequently \( \lim_{x \to \infty} e^x = \infty \) (this may be
considered to be the case n = 0).

To prove this statement (which is clear for x ≤ 0) it suffices to show that

x > log x for all x > 0.

If x ≤ 1 this is clearly true, since log x ≤ 0. If x > 1, then (Figure 6) x − 1 is
an upper sum for f(t) = 1/t on [1, x], so log x < x − 1 < x.


FIGURE 6


Step 2. \( \lim_{x \to \infty} \dfrac{e^x}{x} = \infty \).

To prove this, note that

\[ \frac{e^x}{x} = \frac{e^{x/2} \cdot e^{x/2}}{\dfrac{x}{2} \cdot 2} = \frac{1}{2} \left( \frac{e^{x/2}}{x/2} \right) \cdot e^{x/2}. \]

By Step 1, the expression in parentheses is greater than 1, and
\( \lim_{x \to \infty} e^{x/2} = \infty \); this shows that \( \lim_{x \to \infty} e^x/x = \infty \).

Step 3. \( \lim_{x \to \infty} \dfrac{e^x}{x^n} = \infty \).

Note that

\[ \frac{e^x}{x^n} = \frac{(e^{x/n})^n}{\left( \dfrac{x}{n} \right)^n \cdot n^n} = \frac{1}{n^n} \cdot \left( \frac{e^{x/n}}{x/n} \right)^n. \]

The expression in parentheses becomes arbitrarily large, by Step 2, so the nth
power certainly becomes arbitrarily large. ∎
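A small table makes the growth in Theorem 6 visible, and the algebraic identity used in Step 3 can be checked at a sample point. This is only a numerical illustration; the function name `ratio`, the exponent n = 5, and the sample points are arbitrary choices.

```python
import math

def ratio(x, n):
    """The quotient e^x / x^n for x > 0."""
    return math.exp(x) / x ** n

n = 5
xs = (10.0, 50.0, 100.0, 500.0)
values = [ratio(x, n) for x in xs]
for x, v in zip(xs, values):
    print(f"x = {x:6.1f}   e^x / x^{n} = {v:.3e}")

# The identity used in Step 3: e^x / x^n = (1 / n^n) * (e^(x/n) / (x/n))^n.
x = 40.0
step3 = (math.exp(x / n) / (x / n)) ** n / n ** n
print(ratio(x, n), step3)
```

For small x the quotient can be less than 1 (it is about 0.22 at x = 10 when n = 5), which is a reminder that the theorem is a statement about large x only.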


It is now possible to examine carefully the following very interesting function:
f(x) = e^{-1/x^2}, x ≠ 0. We have

\[ f'(x) = e^{-1/x^2} \cdot \frac{2}{x^3}. \]

Therefore,

f'(x) < 0 for x < 0,
f'(x) > 0 for x > 0,

so f is decreasing for negative x and increasing for positive x. Moreover, if |x| is
large, then x^2 is large, so −1/x^2 is close to 0, so e^{-1/x^2} is close to 1 (Figure 7).


FIGURE 7 


The behavior of f near 0 is more interesting. If x is small, then 1/x^2 is large,
so e^{1/x^2} is large, so e^{-1/x^2} = 1/(e^{1/x^2}) is small. This argument, suitably stated
with ε's and δ's, shows that

\[ \lim_{x \to 0} e^{-1/x^2} = 0. \]

Therefore, if we define

\[ f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0 \\ 0, & x = 0, \end{cases} \]

then the function f is continuous (Figure 8). In fact, f is actually differentiable
at 0: Indeed

\[ f'(0) = \lim_{h \to 0} \frac{e^{-1/h^2}}{h} = \lim_{h \to 0} \frac{1/h}{e^{(1/h)^2}}. \]


FIGURE 8


We already know that

\[ \lim_{x \to \infty} \frac{e^x}{x} = \infty; \]

it is all the more true that

\[ \lim_{x \to \infty} \frac{e^{(x^2)}}{x} = \infty, \]

and this means that

\[ \lim_{x \to \infty} \frac{x}{e^{(x^2)}} = 0. \]

Thus

\[ f'(x) = \begin{cases} e^{-1/x^2} \cdot \dfrac{2}{x^3}, & x \neq 0 \\ 0, & x = 0. \end{cases} \]

We can now compute that

\[ f''(0) = \lim_{h \to 0} \frac{f'(h) - f'(0)}{h}
         = \lim_{h \to 0} \frac{e^{-1/h^2} \cdot \dfrac{2}{h^3}}{h}
         = \lim_{h \to 0} \frac{2/h^4}{e^{(1/h)^2}}; \]

an argument similar to the one above shows that f''(0) = 0. Thus

\[ f''(x) = \begin{cases} e^{-1/x^2} \cdot \dfrac{-6}{x^4} + e^{-1/x^2} \cdot \dfrac{4}{x^6}, & x \neq 0 \\ 0, & x = 0. \end{cases} \]


This argument can be continued. In fact, using induction it can be shown
(Problem 29) that f^{(k)}(0) = 0 for every k. The function f is extremely flat at 0,
and approaches 0 so quickly that it can mask many irregularities of other
functions. For example (Figure 9), suppose that

\[ f(x) = \begin{cases} e^{-1/x^2} \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases} \]


FIGURE 9


It can be shown (Problem 30) that for this function it is also true that
f^{(k)}(0) = 0 for all k. This example shows, perhaps more strikingly than any
other, just how bad a function can be, and still be infinitely differentiable. In
Part IV we will investigate even more restrictive conditions on a function,
which will finally rule out behavior of this sort.
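How flat f is at 0 can be seen by dividing f(h) by powers of h. The sketch below tabulates f(h)/h^k for a few small h; it illustrates the kind of limit that the induction in Problem 29 relies on, but it is not that proof. The sample values of h and k are arbitrary.

```python
import math

def f(x):
    """The flat function: exp(-1/x^2) for x != 0, and 0 at x = 0."""
    return math.exp(-1.0 / x ** 2) if x != 0 else 0.0

hs = (0.5, 0.2, 0.1, 0.05)
for k in (1, 5, 10):
    # Even after division by h^k the quotient collapses toward 0 as h shrinks.
    row = [f(h) / h ** k for h in hs]
    print(k, ["%.3e" % value for value in row])
```

Even for k = 10 the last entry of the row is far smaller than anything a polynomial bound could explain, in line with Theorem 6 applied to 1/h^2.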


PROBLEMS 


1. Differentiate each of the following functions (remember that a^{b^c} always
denotes a^{(b^c)}).


() fle) =e”. 

(ii) 7 = log(1 + log(1 + log(1 + ete*))). 
(iii) f(x) = (sin x)ti2 Gia), 

(iv) 70) = or**). 

(v) f(x) = sin x%07"", 

(i) fla) = login sin, 6 

(vii) f(x) = | arcsin (- Ὶ 


n x 

(viii) f(x) = (log(3 + e*))e*” + (arcsin x)!>8°, 
(ix) f(x) = (og x)***. 
(x) 1.) = x. 


2. Graph each of the following functions. 
(a) f(x) = ει, 


(b) f(x) = em. 
(c) f(x) = ε +e. (Compare the graphs with the graphs of exp and 


(d) f(x) = &* — e&*. } 1/exp.) 
95 -οῷ e* —] 2 
τε, gene 


3. Find the following limits by l’Hépital’s Rule. 


en. eee eens 
(i) lim εἶ — 1 x [2 
2-0 x 


ay ge Cm 1L— x — x?/2 — x3/6 
11) im —————_____——____. 
(i) «-Ὁ χϑ 
aoe i Na nae 
EN ean eal 
z—0 x 




ἐὰν y αὐ Ὁ γϑ = 1} 


a (COS x, sin x) 


(b) 


FIGURE 10 


log(1 +x) —x+ x?/2 


(iv) lim 
z—>0 x? 
log(1 -- 2 
tha og(1 +x) —x +x [2 
z—0 x? 
= 2/7 __ vB 
ial log(1 + x) — x + x?/2 — x [3 
z—0 x3 
The functions

\[ \sinh x = \frac{e^x - e^{-x}}{2}, \]

\[ \cosh x = \frac{e^x + e^{-x}}{2}, \]

\[ \tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}} = 1 - \frac{2}{e^{2x} + 1}, \]

are called the hyperbolic sine, hyperbolic cosine, and hyperbolic 
tangent, respectively (but usually read sinch, cosh, and tanch). There 
are many analogies between these functions and their ordinary trigo- 
nometric counterparts. One analogy is illustrated in Figure 10; a proof 
that the region shown in Figure 10(b) really has area x/2 is best deferred 
until the next chapter, when we will develop methods of computing 
integrals. Other analogies are discussed in the following three problems, 
but the deepest analogies must wait until Chapter 26. If you have not 
already done Problem 2, graph the functions sinh, cosh, and tanh. 


Prove that 
(a) cosh? — sinh? = 1. 
(b) tanh? + 1/cosh? = 1. 
(c) sinh(x + y) = sinh x cosh y + cosh «x sinh y. 
(d) cosh(x + y) = cosh x cosh y + sinh x sinh y. 
(e) sinh' = cosh.
(f) cosh' = sinh.
(g) tanh' = 1/cosh^2.


The functions sinh and tanh are one-one; their inverses, denoted by arg 
sinh and arg tanh (the ‘‘argument”’ of the hyperbolic sine and tangent), 
are defined on R and (—1, 1), respectively. If cosh is restricted to [0, ©) 
it has an inverse, denoted by arg cosh, which is defined on [1, ©). 
Prove, using the information in Problem 5, that 


(a) \( \sinh(\cosh^{-1} x) = \sqrt{x^2 - 1} \).
(b) \( \cosh(\sinh^{-1} x) = \sqrt{1 + x^2} \).

(c) \( (\sinh^{-1})'(x) = \dfrac{1}{\sqrt{1 + x^2}} \).


11. 
12. 




(d) \( (\cosh^{-1})'(x) = \dfrac{1}{\sqrt{x^2 - 1}} \) for x > 1.

(e) \( (\tanh^{-1})'(x) = \dfrac{1}{1 - x^2} \) for |x| < 1.


(a) Find an explicit formula for sinh^{-1}, cosh^{-1}, and tanh^{-1} (by solving
the equation y = sinh^{-1} x for x in terms of y, etc.).

(b) Find

\[ \int_a^b \frac{1}{\sqrt{1 + x^2}}\,dx, \]

\[ \int_a^b \frac{1}{\sqrt{x^2 - 1}}\,dx \quad \text{for } a, b > 1 \text{ or } a, b < -1, \]

\[ \int_a^b \frac{1}{1 - x^2}\,dx. \]


Compare your answer for the third integral with that obtained by 
writing 


Find 


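(a) \( \lim_{x \to \infty} a^x \) for 0 < a < 1. (Remember the definition!)

(b) \( \lim_{x \to \infty} \dfrac{x}{(\log x)^n} \).

(c) \( \lim_{x \to \infty} \dfrac{(\log x)^n}{x} \).

(d) \( \lim_{x \to 0^+} x(\log x)^n \). Hint: \( x(\log x)^n = \dfrac{(-1)^n \left( \log \dfrac{1}{x} \right)^n}{\dfrac{1}{x}} \).

(e) \( \lim_{x \to 0^+} x^x \).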

Graph f(x) = x* for x > 0. (Use Problem 8(e).) 

(a) Find the minimum value of f(x) = e^x/x^n for x > 0, and conclude
that f(x) > e^n/n^n for x > n.

(b) Using the expression f'(x) = e^x(x − n)/x^{n+1}, prove that f'(x) >
e^{n+1}/(n + 1)^{n+1} for x > n + 1, and thus obtain another proof that
\( \lim_{x \to \infty} f(x) = \infty \).


Graph f(x) = e^x/x^n.
(a) Find \( \lim_{y \to 0} \log(1 + y)/y \). (You can use l'Hôpital's Rule, but that
would be silly.)
(b) Find \( \lim_{x \to \infty} x \log(1 + 1/x) \).




13. 
14, 


15. 


*16. 


*17. 


*18. 


*19, 


*20. 
21. 


(c) Prove that \( e = \lim_{x \to \infty} (1 + 1/x)^x \).
(d) Prove that \( e^a = \lim_{x \to \infty} (1 + a/x)^x \). (It is possible to derive this from
part (c) with just a little algebraic fiddling.)
*(e) Prove that \( \log b = \lim_{x \to \infty} x(b^{1/x} - 1) \).


Graph f(x) = (1 + 1/x)* for x > 0. (Use Problem 12(c).) 

If a bank gives a percent interest per annum, then an initial investment I
yields I(1 + a/100) after 1 year. If the bank compounds the interest
(counts the accrued interest as part of the capital for computing interest
the next year), then the initial investment grows to I(1 + a/100)^n after
n years. Now suppose that interest is given twice a year. The final amount
after n years is, alas, not I(1 + a/100)^{2n}, but merely I(1 + a/200)^{2n};
although interest is awarded twice as often, the interest must be halved
in each calculation, since the interest is a/2 per half year. This amount
is larger than I(1 + a/100)^n, but not that much larger. Suppose that the
bank now compounds the interest continuously, i.e., the bank considers
what the investment would yield when compounding k times a year, and
then takes the least upper bound of all these numbers. How much will an
initial investment of 1 dollar yield after 1 year?
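Before attempting a proof it may help to experiment. The sketch below tabulates the one-year yield of 1 dollar for several compounding frequencies; the rate a = 6 and the function name `yearly_yield` are arbitrary choices for illustration, and the comparison value e^{a/100} is the limit suggested by Problem 12(d), printed here only as a numerical check.

```python
import math

def yearly_yield(a, k):
    """Value after one year of 1 dollar at a percent interest, compounded k times a year."""
    return (1 + a / (100 * k)) ** k

a = 6.0
for k in (1, 2, 12, 365, 10 ** 6):
    print(f"k = {k:>7}   yield = {yearly_yield(a, k):.10f}")
print("e^(a/100)       =", math.exp(a / 100))
```

The yields increase with k but stay below e^{a/100}, which is consistent with taking a least upper bound.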

Let f(x) = log |x| for x ≠ 0. Prove that f'(x) = 1/x for x ≠ 0.

Prove that if f' = cf for some number c, then f(x) = ke^{cx} for some
number k.

A radioactive substance diminishes at a rate proportional to the amount 
present (since all atoms have equal probability of disintegrating, the 
total disintegration is proportional to the number of atoms remaining). 
If A(t) is the amount at time t, this means that A'(t) = cA(t) for some c
(which represents the probability that an atom will disintegrate).


(a) Find A(t) in terms of the amount A_0 = A(0) present at time 0.
(b) Show that there is a number τ (the "half-life" of the radioactive element)
with the property that A(t + τ) = A(t)/2.


Newton's law of cooling states that an object cools at a rate proportional
to the difference of its temperature and the temperature of the surrounding
medium. Find the temperature T(t) of the object at time t, in terms
of its temperature T_0 at time 0, assuming that the temperature of the
surrounding medium is kept at a constant, M. Hint: To solve the differential
equation expressing Newton's law, remember that T' = (T − M)'.


Prove that if \( f(x) = \int_0^x f(t)\,dt \), then f = 0.
Find all functions f satisfying \( f'(t) = f(t) + \int_0^1 f(t)\,dt \).
(a) Prove that

\[ 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} < e^x \quad \text{for } x > 0. \]

Hint: Use induction on n, and compare derivatives.


FIGURE 11 


22. 


ad i 


*24. 




(b) Give a new proof that \( \lim_{x \to \infty} e^x/x^n = \infty \).


Give yet another proof of this fact, using the appropriate form of l'Hôpital's
Rule. (See Problem 11-38.)

A point P is moving along a line segment AB of length 10^7 while another
point Q moves along an infinite ray (Figure 11). The velocity of P is
always equal to the distance from P to B (in other words, if P(t) is the
position of P at time t, then P'(t) = 10^7 − P(t)), while Q moves with
constant velocity Q'(t) = 10^7. The distance traveled by Q after time t is
defined to be the Napierian logarithm of the distance from P to B at time t.
Thus

10^7 t = Nap log[10^7 − P(t)].


This was the definition of logarithms given by Napier (1550-1617) in his
publication of 1614, Mirifici logarithmorum canonis descriptio (A Description
of the Wonderful Law of Logarithms); work which was done before
the use of exponents was invented! The number 10^7 was chosen because
Napier's tables (intended for astronomical and navigational calculations),
listed the logarithms of sines of angles, for which the best possible
available tables extended to seven decimal places, and Napier wanted to
avoid fractions. Prove that

\[ \text{Nap} \log x = 10^7 \log \frac{10^7}{x}. \]

Hint: Use the same trick as in Problem 18 to solve the equation for P.

(a) Sketch the graph of f(x) = (log x)/x (paying particular attention
to the behavior near 0 and ∞).

(b) Which is larger, e^π or π^e?

(c) Prove that if 0 < x ≤ 1, or x = e, then the only number y satisfying
x^y = y^x is y = x; but if x > 1, x ≠ e, then there is precisely one
number y ≠ x satisfying x^y = y^x; moreover, if x < e, then y > e,
and if x > e, then y < e. (Interpret these statements in terms of the
graph in part (a)!)

(d) Prove that if x and y are natural numbers and x^y = y^x, then x = y
or x = 2, y = 4, or x = 4, y = 2.

(e) Show that the set of all pairs (x, y) with x^y = y^x consists of a curve
and a straight line which intersect; find the intersection and draw a
rough sketch.


**(f) For 1 < x < e let g(x) be the unique number > e with x^{g(x)} =
g(x)^x. Prove that g is differentiable. (It is a good idea to consider
separate functions,

\[ f_1(x) = \frac{\log x}{x}, \quad 0 < x < e \]

\[ f_2(x) = \frac{\log x}{x}, \quad e < x \]




*25. 


26. 


ἘΖ7, 


*28. 


"29. 


*30. 


31. 


and write g in terms of f_1 and f_2. If you do this part properly you
should be able to show that

\[ g'(x) = \frac{[g(x)]^2}{1 - \log g(x)} \cdot \frac{1 - \log x}{x^2}.) \]


This problem uses the material from the Appendix to Chapter 11. 


(a) Prove that exp is convex and log is concave. 


(b) Prove that if \( \sum_{i=1}^{n} p_i = 1 \) and all p_i > 0, then

\[ z_1^{p_1} \cdot \ldots \cdot z_n^{p_n} \le p_1 z_1 + \cdots + p_n z_n. \]

(Use Problem 8 from the Appendix to Chapter 11.)
(c) Deduce another proof that G_n ≤ A_n (Problem 2-20).


Suppose f satisfies f' = f and f(x + y) = f(x)f(y) for all x and y. Prove
that f = exp or f = 0.

Prove that if f is continuous and f(x + y) = f(x)f(y) for all x and y, then
either f = 0 or f(x) = [f(1)]^x for all x. Hint: Show that f(x) = [f(1)]^x
for rational x, and then use Problem 8-6. This problem is closely related
to Problem 8-7, and the information mentioned at the end of Problem
8-7 can be used to show that there are discontinuous functions f satisfying
f(x + y) = f(x)f(y).

Prove that if f is a continuous function defined on the positive reals, and 
f(xy) = f(x) + f(y) for all positive x and y, then f = 0 or f(x) = 
f(e) log x for all x > 0. Hint: Consider g(x) = f(e*). 

Prove that if f(x) = e^{-1/x^2} for x ≠ 0, and f(0) = 0, then f^{(k)}(0) = 0 for
all k.

Prove that if f(x) = e^{-1/x^2} sin 1/x for x ≠ 0, and f(0) = 0, then f^{(k)}(0) =
0 for all k.

(a) Prove that if α is a root of the equation

\[ (*) \quad a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0, \]

then the function y(x) = e^{αx} satisfies the differential equation

\[ (**) \quad a_n y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0. \]


*(b) Prove that if α is a double root of (*), then y(x) = xe^{αx} also satisfies
(**). Hint: Remember that if α is a double root of a polynomial
equation f(x) = 0, then f'(α) = 0.


*(c) Prove that if α is a root of (*) of order r, then y(x) = x^k e^{αx} is a solution
for 0 ≤ k ≤ r − 1.


If (*) has n real numbers as roots (counting multiplicities), part (c)
gives n solutions y_1, . . . , y_n of (**).


(d) Prove that in this case the function c_1 y_1 + · · · + c_n y_n also
satisfies (**).


It is a theorem that in this case these are the only solutions of (**).
Problem 16 and the next two problems prove special cases of this
theorem, and the general case is considered in Problem 19-17.
In Chapter 26 we will see what to do when (*) does not have n real
numbers as roots.


*32. Suppose that f satisfies f’’ — f = 0 and f(0) = f’(0) = 0. Prove that 
f = 0 as follows. 


(a) Show that f^2 − (f')^2 = 0.

(b) Suppose that f(x) ≠ 0 for all x in some interval (a, b). Show that
either f(x) = ce^x or else f(x) = ce^{-x} for all x in (a, b), for some
constant c.

**(c) If f(x_0) ≠ 0 for x_0 > 0, say, then there would be a number a such
that 0 ≤ a < x_0 and f(a) = 0, while f(x) ≠ 0 for a < x < x_0.
Why? Use this fact and part (b) to deduce a contradiction.


*33. (a) Show that if f satisfies f'' − f = 0, then f(x) = ae^x + be^{-x} for some
a and b. (First figure out what a and b should be in terms of f(0)
and f'(0), and then use Problem 32.)
(b) Show also that f = a sinh + b cosh for some (other) a and b.
34. Find all functions f satisfying


(a) f^{(n)} = f^{(n-1)}.
(b) f^{(n)} = f^{(n-2)}.

*35. This problem, a companion to Problem 15-28, outlines a treatment of 
the exponential function starting from the assumption that the differen- 
tial equation f’ = f has a nonzero solution. 


(a) Suppose there is a function f ≠ 0 with f' = f. Prove that f(x) ≠ 0
for each x by considering the function g(x) = f(x_0 + x)f(x_0 − x),
where f(x_0) ≠ 0.

(b) Show that there is a function f satisfying f' = f and f(0) = 1.

(c) For this f show that f(x + y) = f(x) · f(y) by considering the function
g(x) = f(x + y)/f(x).

(d) Prove that f is one-one and that (f^{-1})'(x) = 1/x.


**36. A function f is said to grow faster than g if \( \lim_{x \to \infty} f(x)/g(x) = \infty \). For
example, exp grows faster than any polynomial function. Suppose that
g_1, g_2, g_3, . . . are continuous functions. Show that there is a continuous
function f which grows faster than each g_i.

37. Prove that log_10 2 is irrational.


CHAPTER 


INTEGRATION IN ELEMENTARY TERMS 


Every computation of a derivative yields, according to the Second Funda- 
mental Theorem of Calculus, a formula about integrals. For example, 


if F(x) = x(log x) — x, then F’(x) = log x; 
consequently, 
\[ \int_a^b \log x\,dx = F(b) - F(a) = b(\log b) - b - [a(\log a) - a], \qquad 0 < a, b. \]

Formulas of this sort are simplified considerably if we adopt the notation

\[ F(x)\,\Big|_a^b = F(b) - F(a). \]

We may then write

\[ \int_a^b \log x\,dx = x(\log x) - x\,\Big|_a^b. \]


This evaluation of \( \int_a^b \log x\,dx \) depended on the lucky guess that log is the
derivative of the function F(x) = x(log x) − x. In general, a function F
satisfying F' = f is called a primitive of f. Of course, a continuous function f
always has a primitive, namely,

\[ F(x) = \int_a^x f, \]


but in this chapter we will try to find a primitive which can be written in 
terms of familiar functions like sin, log, etc. A function which can be written 
in this way is called an elementary function. To be precise,* an elementary 
function is one which can be obtained by addition, multiplication, division, 
and composition from the rational functions, the trigonometric functions and 
their inverses, and the functions log and exp. 

It should be stated at the very outset that elementary primitives usually 
cannot be found. For example, there is no elementary function F such that 


\[ F'(x) = e^{-x^2} \quad \text{for all } x \]


(this is not merely a report on the present state of mathematical ignorance; it 
is a (difficult) theorem that no such function exists). And, what is even worse, 
you will have no way of knowing whether or not an elementary primitive can 
be found (you will just have to hope that the problems for this chapter contain 


* The definition which we will give is precise, but not really accurate, or at least not quite 
standard. Usually the elementary functions are defined to include “‘algebraic’’ functions, 
that is, functions g satisfying an equation 


\[ (g(x))^n + f_{n-1}(x)(g(x))^{n-1} + \cdots + f_0(x) = 0, \]
where the f; are rational functions. But for our purposes these functions can be ignored. 




no misprints). Because the search for elementary primitives is so uncertain, 
finding one is often peculiarly satisfying. If we observe that the function 


\[ F(x) = x \arctan x - \frac{\log(1 + x^2)}{2} \]

satisfies

F'(x) = arctan x

(just how we would ever be led to such an observation is quite another matter),
so that

\[ \int_a^b \arctan x\,dx = x \arctan x - \frac{\log(1 + x^2)}{2}\,\Big|_a^b, \]

then we may feel that we have "really" evaluated \( \int_a^b \arctan x\,dx \).


This chapter consists of little more than methods for finding elementary 
primitives of given elementary functions (a process known simply as “integra- 
tion’), together with some notation, abbreviations, and conventions designed 
to facilitate this procedure. This preoccupation with elementary functions can 
be justified by three considerations: 


(1) Integration is a standard topic in calculus, and everyone should 
know about it. 

(2) Every once in a while you might actually need to evaluate an integral, 
under conditions which do not allow you to consult any of the 
standard integral tables (for example, you might take a (physics) 
course in which you are expected to be able to integrate). 

(3) The most useful ‘“‘methods” of integration are actually very impor- 
tant theorems (that apply to all functions, not just elementary ones). 


Naturally, the last reason is the crucial one. Even if you intend to forget 
how to integrate (and you probably will forget some details the first time 
through), you must never forget the basic methods. 

These basic methods are theorems which allow us to express primitives of 
one function in terms of primitives of other functions. To begin integrating we 
will therefore need a list of primitives for some functions; such a list can be 
obtained simply by differentiating various well-known functions. The list 
given below makes use of a standard symbol which requires some explanation. 
The symbol 


\[ \int f \quad \text{or} \quad \int f(x)\,dx \]

means "a primitive of f" or, more precisely, "the collection of all primitives
of f." The symbol ∫f will often be used in stating theorems, while ∫f(x) dx is
most useful in formulas like the following:

\[ \int x^3\,dx = \frac{x^4}{4}. \]




This "equation" means that the function F(x) = x^4/4 satisfies F'(x) = x^3. It
cannot be interpreted literally because the right side is a number, not a function,
but in this one context we will allow such discrepancies; our aim is to
make the integration process as mechanical as possible, and we will resort to
any possible device. Another feature of the equation deserves mention. Most
people write

\[ \int x^3\,dx = \frac{x^4}{4} + C, \]

to emphasize that the primitives of f(x) = x^3 are precisely the functions of the
form F(x) = x^4/4 + C for some number C. Although it is possible (Problem 11) to
obtain contradictions if this point is disregarded, in practice such difficulties
do not arise, and concern for this constant is merely an annoyance.

There is one important convention accompanying this notation: the letter 
appearing on the right side of the equation should match with the letter 
appearing after "d" on the left side; thus

\[ \int u^3\,du = \frac{u^4}{4}, \]

\[ \int tx\,dx = \frac{tx^2}{2}, \]

\[ \int tx\,dt = \frac{xt^2}{2}. \]


A function in ∫f(x) dx, i.e., a primitive of f, is often called an "indefinite
integral" of f, while \( \int_a^b f(x)\,dx \) is called, by way of contrast, a "definite
integral." This suggestive notation works out quite well in practice, but it is
important not to be led astray. At the risk of boring you, the following fact is
emphasized once again: the integral \( \int_a^b f(x)\,dx \) is not defined as "F(b) − F(a),
where F is an indefinite integral of f" (if you do not find this statement repetitious,
it is time to reread Chapter 13).

We can verify the formulas in the following short table of indefinite integrals 
simply by differentiating the functions indicated on the right side. 


\[ \int a\,dx = ax \]

\[ \int x^n\,dx = \frac{x^{n+1}}{n+1}, \quad n \neq -1 \]

\[ \int \frac{1}{x}\,dx = \log x \]

(\( \int \frac{1}{x}\,dx \) is often written \( \int \frac{dx}{x} \) for convenience; similar
abbreviations are used in the last two examples of this table.)

\[ \int e^x\,dx = e^x \]

\[ \int \sin x\,dx = -\cos x \]

\[ \int \cos x\,dx = \sin x \]

\[ \int \sec^2 x\,dx = \tan x \]

\[ \int \sec x \tan x\,dx = \sec x \]

\[ \int \frac{dx}{1 + x^2} = \arctan x \]

\[ \int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x \]


Two general formulas of the same nature are consequences of theorems
about differentiation:

\[ \int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx, \]

\[ \int c \cdot f(x)\,dx = c \cdot \int f(x)\,dx. \]

These equations should be interpreted as meaning that a primitive of f + g
can be obtained by adding a primitive of f to a primitive of g, while a primitive
of c · f can be obtained by multiplying a primitive of f by c.

Notice the consequences of these formulas for definite integrals: If f and g
are continuous, then

\[ \int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx, \]

\[ \int_a^b c \cdot f(x)\,dx = c \cdot \int_a^b f(x)\,dx. \]

These follow from the previous formulas, since each definite integral may be
written as the difference of the values at a and b of a corresponding primitive.
Continuity is required in order to know that these primitives exist. (Of course,
the formulas are also true when f and g are merely integrable, but recall how
much more difficult the proofs are in this case.)


The product formula for the derivative yields a more interesting theorem,
which will be written in several different ways.


THEOREM 1 (INTEGRATION BY PARTS)

If f' and g' are continuous, then

\[ \int f g' = fg - \int f' g, \]

\[ \int f(x) g'(x)\,dx = f(x) g(x) - \int f'(x) g(x)\,dx, \]

\[ \int_a^b f(x) g'(x)\,dx = f(x) g(x)\,\Big|_a^b - \int_a^b f'(x) g(x)\,dx. \]

(Notice that in the second equation f(x)g(x) denotes the function f · g.)


PROOF

The formula

(fg)' = f'g + fg'

can be written

fg' = (fg)' − f'g.

Thus

\[ \int f g' = \int (fg)' - \int f' g, \]

and fg can be chosen as one of the functions denoted by ∫(fg)'. This proves
the first formula.

The second formula is merely a restatement of the first, and the third
formula follows immediately from either of the first two. ∎


As the following examples illustrate, integration by parts is useful when the 
function to be integrated can be considered as a product of a function f, whose 
derivative is simpler than f, and another function which is obviously of the 
form g’. 

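\[ \int \underbrace{x}_{f} \, \underbrace{e^x}_{g'} \, dx = \underbrace{x}_{f} \, \underbrace{e^x}_{g} - \int \underbrace{1}_{f'} \cdot \underbrace{e^x}_{g} \, dx = x e^x - e^x, \]

\[ \int \underbrace{x}_{f} \, \underbrace{\sin x}_{g'} \, dx = \underbrace{x}_{f} \cdot \underbrace{(-\cos x)}_{g} - \int \underbrace{1}_{f'} \cdot \underbrace{(-\cos x)}_{g} \, dx = -x \cos x + \sin x. \]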
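A primitive found by parts can always be checked by differentiating it. The following Python sketch, which is not part of the original text, does this numerically for the two results above; the helper names `derivative` and `max_error`, the step size, and the sample points are illustrative assumptions.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

def F1(x):
    # Claimed primitive of x * e^x.
    return x * math.exp(x) - math.exp(x)

def F2(x):
    # Claimed primitive of x * sin x.
    return -x * math.cos(x) + math.sin(x)

def max_error(F, f, points):
    """Largest gap between the numerical derivative of F and the integrand f."""
    return max(abs(derivative(F, x) - f(x)) for x in points)

points = [-2.0, -0.5, 0.0, 1.0, 2.5]
err1 = max_error(F1, lambda x: x * math.exp(x), points)
err2 = max_error(F2, lambda x: x * math.sin(x), points)
print(err1, err2)  # both gaps should be very small
```

A small gap is evidence, not a proof; the proof is the differentiation itself.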

There are two special tricks which often work with integration by parts. 


The first is to consider the function g’ to be the factor 1, which can always be 
written in. 


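\[ \int \log x \, dx = \int \underbrace{1}_{g'} \cdot \underbrace{\log x}_{f} \, dx = \underbrace{x}_{g} \, \underbrace{\log x}_{f} - \int \underbrace{x}_{g} \cdot \underbrace{(1/x)}_{f'} \, dx = x(\log x) - x. \]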


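The second trick is to use integration by parts to find $\int h$ in terms of $\int h$ again,
and then solve for $\int h$. A simple example is the calculation

\[ \int \underbrace{(1/x)}_{g'} \cdot \underbrace{\log x}_{f} \, dx = \underbrace{\log x}_{g} \cdot \underbrace{\log x}_{f} - \int \underbrace{(1/x)}_{f'} \cdot \underbrace{\log x}_{g} \, dx, \]

which implies that

\[ 2 \int \frac{1}{x} \log x \, dx = (\log x)^2 \]

or

\[ \int \frac{\log x}{x} \, dx = \frac{(\log x)^2}{2}. \]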


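A more complicated calculation is often required:

\[
\begin{aligned}
\int \underbrace{e^x}_{f} \, \underbrace{\sin x}_{g'} \, dx
 &= \underbrace{e^x}_{f} \cdot \underbrace{(-\cos x)}_{g} - \int \underbrace{e^x}_{f'} \cdot \underbrace{(-\cos x)}_{g} \, dx \\
 &= -e^x \cos x + \int \underbrace{e^x}_{u} \, \underbrace{\cos x}_{v'} \, dx \\
 &= -e^x \cos x + \Big[ \underbrace{e^x}_{u} \cdot \underbrace{(\sin x)}_{v} - \int \underbrace{e^x}_{u'} \, \underbrace{(\sin x)}_{v} \, dx \Big];
\end{aligned}
\]

therefore,

\[ 2 \int e^x \sin x \, dx = e^x (\sin x - \cos x) \]

or

\[ \int e^x \sin x \, dx = \frac{e^x (\sin x - \cos x)}{2}. \]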
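Since a primitive F of f gives every definite integral of f as F(b) - F(a), the formula just obtained can be compared with a direct numerical integration. The sketch below is an illustration only: the helper `simpson` (composite Simpson's rule), the interval [0, 2], and the number of subintervals are assumptions, not anything taken from the text.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    # Primitive of e^x sin x obtained by integrating by parts twice.
    return math.exp(x) * (math.sin(x) - math.cos(x)) / 2

a, b = 0.0, 2.0
numeric = simpson(lambda x: math.exp(x) * math.sin(x), a, b)
exact = F(b) - F(a)
print(numeric, exact)  # the two values should agree closely
```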


Since integration by parts depends upon recognizing that a function is of 
the form g’, the more functions you can already integrate, the greater your 
chances for success. It is frequently reasonable to do a preliminary integration 
before tackling the main problem. For example, we can use parts to integrate 


\[ \int (\log x)^2 \, dx = \int \underbrace{(\log x)}_{f} \, \underbrace{(\log x)}_{g'} \, dx \]

if we recall that $\int \log x \, dx = x(\log x) - x$ (this formula was itself derived by
integration by parts); we have

\[
\begin{aligned}
\int (\log x)(\log x) \, dx
 &= (\log x)[x(\log x) - x] - \int (1/x)[x(\log x) - x] \, dx \\
 &= (\log x)[x(\log x) - x] - \int [\log x - 1] \, dx \\
 &= (\log x)[x(\log x) - x] - \int \log x \, dx + \int 1 \, dx \\
 &= (\log x)[x(\log x) - x] - [x(\log x) - x] + x \\
 &= x(\log x)^2 - 2x(\log x) + 2x.
\end{aligned}
\]


The most important method of integration is a consequence of the Chain 
Rule. The use of this method requires considerably more ingenuity than 
integrating by parts, and even the explanation of the method is more difficult. 


We will therefore develop this method in stages, stating the theorem for definite
integrals first, and saving the treatment of indefinite integrals for later.


THEOREM 2 (THE SUBSTITUTION FORMULA)

If f and g′ are continuous, then

\[ \int_{g(a)}^{g(b)} f = \int_a^b (f \circ g) \cdot g', \]
\[ \int_{g(a)}^{g(b)} f(u) \, du = \int_a^b f(g(x)) \cdot g'(x) \, dx. \]


PROOF

If F is a primitive of f, then the left side is F(g(b)) - F(g(a)). On the other
hand,

\[ (F \circ g)' = (F' \circ g) \cdot g' = (f \circ g) \cdot g', \]

so F ∘ g is a primitive of (f ∘ g) · g′ and the right side is

\[ (F \circ g)(b) - (F \circ g)(a) = F(g(b)) - F(g(a)). \] ∎
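The two sides of the substitution formula can be compared numerically for a particular choice of f and g. In the following sketch the choices f(u) = u^5 and g = sin, the interval [0.3, 1.2], and the helper `simpson` are assumptions made purely for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f(u):
    return u ** 5

g = math.sin        # g(x) = sin x
g_prime = math.cos  # g'(x) = cos x

a, b = 0.3, 1.2

# Left side: integral of f(u) du from g(a) to g(b).
left = simpson(f, g(a), g(b))

# Right side: integral of f(g(x)) g'(x) dx from a to b.
right = simpson(lambda x: f(g(x)) * g_prime(x), a, b)

# Value F(g(b)) - F(g(a)) with the primitive F(u) = u^6 / 6.
closed = (g(b) ** 6 - g(a) ** 6) / 6
print(left, right, closed)  # all three should agree closely
```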


The simplest uses of the substitution formula depend upon recognizing
that a given function is of the form (f ∘ g) · g′. For example, the integration of

\[ \int_a^b \sin^5 x \cos x \, dx \quad \Big( = \int_a^b (\sin x)^5 \cos x \, dx \Big) \]

is facilitated by the appearance of the factor cos x, which will be the factor
g′(x) for g(x) = sin x; the remaining expression, $(\sin x)^5$, can be written as
$(g(x))^5 = f(g(x))$, for $f(u) = u^5$. Thus

\[
\begin{aligned}
\int_a^b \sin^5 x \cos x \, dx
 &= \int_a^b f(g(x)) g'(x) \, dx = \int_{g(a)}^{g(b)} f(u) \, du
 \qquad [\, g(x) = \sin x, \ f(u) = u^5 \,] \\
 &= \int_{\sin a}^{\sin b} u^5 \, du = \frac{\sin^6 b}{6} - \frac{\sin^6 a}{6}.
\end{aligned}
\]


The integration of $\int_a^b \tan x \, dx$ can be treated similarly if we write

\[ \int_a^b \tan x \, dx = - \int_a^b \frac{-\sin x}{\cos x} \, dx. \]

In this case the factor -sin x is g′(x), where g(x) = cos x; the remaining
factor 1/cos x can then be written f(cos x) for f(u) = 1/u. Hence

\[
\begin{aligned}
\int_a^b \tan x \, dx
 &= - \int_a^b f(g(x)) g'(x) \, dx = - \int_{g(a)}^{g(b)} f(u) \, du
 \qquad [\, g(x) = \cos x, \ f(u) = 1/u \,] \\
 &= - \int_{\cos a}^{\cos b} \frac{1}{u} \, du = \log(\cos a) - \log(\cos b).
\end{aligned}
\]


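Finally, to find

\[ \int_a^b \frac{1}{x \log x} \, dx, \]

notice that 1/x = g′(x) where g(x) = log x, and that 1/log x = f(g(x)) for
f(u) = 1/u. Thus

\[
\begin{aligned}
\int_a^b \frac{1}{x \log x} \, dx
 &= \int_a^b f(g(x)) g'(x) \, dx = \int_{g(a)}^{g(b)} f(u) \, du
 \qquad [\, g(x) = \log x, \ f(u) = 1/u \,] \\
 &= \int_{\log a}^{\log b} \frac{1}{u} \, du = \log(\log b) - \log(\log a).
\end{aligned}
\]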

Fortunately, these uses of the substitution formula can be shortened considerably.
The intermediate steps, which involve writing

\[ \int_a^b f(g(x)) g'(x) \, dx = \int_{g(a)}^{g(b)} f(u) \, du, \]

can easily be eliminated by noticing the following: To go from the left side
to the right side,

    substitute   u for g(x)
                 du for g′(x) dx
    (and change the limits of integration);

the substitutions can be performed directly on the original function (accounting
for the name of this theorem). For example,

\[ \int_a^b \sin^5 x \cos x \, dx \quad \Big[ \text{substitute } u \text{ for } \sin x, \ du \text{ for } \cos x \, dx \Big] \quad = \int_{\sin a}^{\sin b} u^5 \, du, \]

and similarly

\[ \int_a^b \frac{-\sin x}{\cos x} \, dx \quad \Big[ \text{substitute } u \text{ for } \cos x, \ du \text{ for } -\sin x \, dx \Big] \quad = \int_{\cos a}^{\cos b} \frac{1}{u} \, du. \]

Usually we abbreviate this method even more, and say simply:

    "Let u = g(x)
         du = g′(x) dx."

Thus

\[ \int_a^b \frac{1}{x \log x} \, dx \quad \Big[ \text{let } u = \log x, \ du = \frac{1}{x} \, dx \Big] \quad = \int_{\log a}^{\log b} \frac{1}{u} \, du. \]


In this chapter we are usually interested in primitives rather than definite
integrals, but if we can find $\int_a^b f(x) \, dx$ for all a and b, then we can certainly
find $\int f(x) \, dx$. For example, since

\[ \int_a^b \sin^5 x \cos x \, dx = \frac{\sin^6 b}{6} - \frac{\sin^6 a}{6}, \]

it follows that

\[ \int \sin^5 x \cos x \, dx = \frac{\sin^6 x}{6}. \]

Similarly,

\[ \int \tan x \, dx = -\log \cos x, \]

\[ \int \frac{1}{x \log x} \, dx = \log(\log x). \]


It is quite uneconomical to obtain primitives from the substitution formula by 
first finding definite integrals. Instead, the two steps can be combined, to 
yield the following procedure: 


(1) Let 


u = g(x), 
du = g'(x) dx; 


(after this manipulation only the letter u should appear, not the 
letter x). 

(2) Find a primitive (as an expression involving u).

(3) Substitute g(x) back for u. 


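Thus, to find

\[ \int \sin^5 x \cos x \, dx, \]

(1) let
        u = sin x,
        du = cos x dx,
    so that we obtain

\[ \int u^5 \, du; \]

(2) evaluate

\[ \int u^5 \, du = \frac{u^6}{6}; \]

(3) remember to substitute sin x back for u, so that

\[ \int \sin^5 x \cos x \, dx = \frac{\sin^6 x}{6}. \]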


Similarly, if

        u = log x,
        du = (1/x) dx,

then

\[ \int \frac{1}{x \log x} \, dx \quad \text{becomes} \quad \int \frac{1}{u} \, du = \log u, \]

so that

\[ \int \frac{1}{x \log x} \, dx = \log(\log x). \]


To evaluate

\[ \int \frac{x}{1 + x^2} \, dx, \]

let

        u = 1 + x²,
        du = 2x dx;

the factor 2 which has just popped up causes no problem: the integral
becomes

\[ \frac{1}{2} \int \frac{1}{u} \, du = \frac{1}{2} \log u, \]

so

\[ \int \frac{x}{1 + x^2} \, dx = \frac{1}{2} \log(1 + x^2). \]

(This result may be combined with integration by parts to yield a formula
already mentioned:

\[ \int 1 \cdot \arctan x \, dx = x \arctan x - \int \frac{x}{1 + x^2} \, dx = x \arctan x - \frac{1}{2} \log(1 + x^2).) \]
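The primitive of arctan obtained in this way can be tested against a direct numerical integration over [0, 1], where it predicts the value π/4 - (1/2) log 2. The sketch below is only an illustration; the helper `simpson` and the choice of interval are assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    # Primitive of arctan found by parts together with u = 1 + x^2.
    return x * math.atan(x) - 0.5 * math.log(1 + x * x)

numeric = simpson(math.atan, 0.0, 1.0)
exact = F(1.0) - F(0.0)  # equals pi/4 - (1/2) log 2
print(numeric, exact)
```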


These applications of the substitution formula* illustrate the most straight- 
forward and least interesting types—once the suitable factor g’(x) is recog- 
nized, the whole problem may even become simple enough to do mentally. 
The following three problems require only the information provided by the 
short table of indefinite integrals at the beginning of the chapter and, of 
course, the right substitution (the third problem has been disguised a little by 
some algebraic chicanery). 


* The substitution formula is often written in the form

\[ \int f(u) \, du = \int f(g(x)) g'(x) \, dx, \qquad u = g(x). \]

This formula cannot be taken literally (after all, $\int f(u) \, du$ should mean a primitive of f and
$\int f(g(x)) g'(x) \, dx$ should mean a primitive of (f ∘ g) · g′; these are certainly not equal). However,
it may be regarded as a symbolic summary of the procedure which we have developed.
If we use Leibnitz's notation, and a little fudging, the formula reads particularly well:

\[ \int f(u) \, du = \int f(u) \frac{du}{dx} \, dx. \]


\[ \int \sec^2 x \tan^5 x \, dx, \]

\[ \int (\cos x) e^{\sin x} \, dx, \]

\[ \int \frac{e^x}{\sqrt{1 - e^{2x}}} \, dx. \]

If you have not succeeded in finding the right substitutions, you should be
able to guess them from the answers, which are $(\tan^6 x)/6$, $e^{\sin x}$, and $\arcsin e^x$.
At first you may find these problems too hard to do in your head, but at least
when g is of the very simple form g(x) = ax + b you should not have to waste
time writing out the substitution. The following integrations should all be
clear. (The only worrisome detail is the proper positioning of the constant:
should the answer to the second be $e^{3x}/3$ or $3e^{3x}$? I always take care of these
problems as follows. Clearly $\int e^{3x} \, dx = e^{3x} \cdot (\text{something})$. Now if I differentiate
$F(x) = e^{3x}$, I get $F'(x) = 3e^{3x}$, so "something" must be 1/3, to cancel the 3.)

\[
\begin{aligned}
\int \frac{dx}{x + 3} &= \log(x + 3), \\
\int e^{3x} \, dx &= \frac{e^{3x}}{3}, \\
\int \cos 4x \, dx &= \frac{\sin 4x}{4}, \\
\int \sin(2x + 1) \, dx &= \frac{-\cos(2x + 1)}{2}, \\
\int \frac{dx}{1 + 4x^2} &= \frac{\arctan 2x}{2}.
\end{aligned}
\]


More interesting uses of the substitution formula occur when the factor
g′(x) does not appear. There are two main types of substitutions where this
happens. Consider first

\[ \int \frac{1 + e^x}{1 - e^x} \, dx. \]

The prominent appearance of the expression $e^x$ suggests the simplifying
substitution

        u = e^x,
        du = e^x dx.

Although the expression $e^x \, dx$ does not appear, it can always be put in:

\[ \int \frac{1 + e^x}{1 - e^x} \, dx = \int \frac{1 + e^x}{1 - e^x} \cdot \frac{1}{e^x} \cdot e^x \, dx. \]


We therefore obtain

\[ \int \frac{1 + u}{1 - u} \cdot \frac{1}{u} \, du, \]

which can be evaluated by the algebraic trick

\[ \int \frac{1 + u}{1 - u} \cdot \frac{1}{u} \, du = \int \Big[ \frac{2}{1 - u} + \frac{1}{u} \Big] du = -2 \log(1 - u) + \log u, \]

so that

\[ \int \frac{1 + e^x}{1 - e^x} \, dx = -2 \log(1 - e^x) + \log e^x = -2 \log(1 - e^x) + x. \]

There is an alternative and preferable way of handling this problem, which
does not require multiplying and dividing by $e^x$. If we write

        u = e^x,   x = log u,
        dx = (1/u) du,

then

\[ \int \frac{1 + e^x}{1 - e^x} \, dx \quad \text{immediately becomes} \quad \int \frac{1 + u}{1 - u} \cdot \frac{1}{u} \, du. \]

Most substitution problems are much easier if one resorts to this trick of
expressing x in terms of u, and dx in terms of du, instead of vice versa. It is not
hard to see why this trick always works (as long as the function expressing u
in terms of x is one-one for all x under consideration): If we apply the
substitution

        u = g(x),   x = g⁻¹(u)
        dx = (g⁻¹)′(u) du

to the integral

\[ \int f(g(x)) \, dx, \]

we obtain

\[ \int f(u) (g^{-1})'(u) \, du. \tag{1} \]

On the other hand, if we apply the straightforward substitution

        u = g(x)
        du = g′(x) dx

to the same integral,

\[ \int f(g(x)) \, dx = \int f(g(x)) \cdot \frac{1}{g'(x)} \cdot g'(x) \, dx, \]

we obtain

\[ \int f(u) \cdot \frac{1}{g'(g^{-1}(u))} \, du. \tag{2} \]

The integrals (1) and (2) are identical, since $(g^{-1})'(u) = 1/g'(g^{-1}(u))$.
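For the concrete integral treated above, the effect of writing x = log u, dx = (1/u) du can be watched numerically: the integral in x over [a, b] and the integral in u over [e^a, e^b] should give the same number, and both should match the primitive -2 log(1 - e^x) + x. In this sketch the interval [-2, -0.5] (chosen so that 1 - e^x stays positive) and the helper `simpson` are assumptions for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, b = -2.0, -0.5  # an interval on which 1 - e^x is positive

# Integral in the original variable x.
direct = simpson(lambda x: (1 + math.exp(x)) / (1 - math.exp(x)), a, b)

# After u = e^x, x = log u, dx = (1/u) du; the limits become e^a and e^b.
via_u = simpson(lambda u: (1 + u) / (1 - u) / u, math.exp(a), math.exp(b))

def F(x):
    # Primitive found in the text.
    return -2 * math.log(1 - math.exp(x)) + x

closed = F(b) - F(a)
print(direct, via_u, closed)  # all three should agree closely
```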




As another concrete example, consider

\[ \int \frac{e^{2x}}{\sqrt{e^x + 1}} \, dx. \]

In this case we will go whole hog and replace the entire expression $\sqrt{e^x + 1}$
by one letter. Thus we choose the substitution

        u = √(e^x + 1),
        u² = e^x + 1,
        u² - 1 = e^x,   x = log(u² - 1),
        dx = 2u/(u² - 1) du.

The integral then becomes

\[ \int \frac{(u^2 - 1)^2}{u} \cdot \frac{2u}{u^2 - 1} \, du = 2 \int (u^2 - 1) \, du = \frac{2u^3}{3} - 2u. \]

Thus

\[ \int \frac{e^{2x}}{\sqrt{e^x + 1}} \, dx = \frac{2}{3} (e^x + 1)^{3/2} - 2 (e^x + 1)^{1/2}. \]


Another example, which illustrates the second main type of substitution
that can occur, is the integral

\[ \int \sqrt{1 - x^2} \, dx. \]

In this case, instead of replacing a complicated expression by a simpler one,
we will replace x by sin u, because $\sqrt{1 - \sin^2 u} = \cos u$. This really means
that we are using the substitution u = arcsin x, but it is the expression for x
in terms of u which helps us find the expression to be substituted for dx. Thus,

        let x = sin u,   [u = arcsin x]
        dx = cos u du;

then the integral becomes

\[ \int \sqrt{1 - \sin^2 u} \, \cos u \, du = \int \cos^2 u \, du. \]

The evaluation of this integral depends on the equation

\[ \cos^2 u = \frac{1 + \cos 2u}{2} \]

(see the discussion of trigonometric functions below) so that

\[ \int \cos^2 u \, du = \int \frac{1 + \cos 2u}{2} \, du = \frac{u}{2} + \frac{\sin 2u}{4} \]

and

\[
\begin{aligned}
\int \sqrt{1 - x^2} \, dx
 &= \frac{\arcsin x}{2} + \frac{\sin(2 \arcsin x)}{4} \\
 &= \frac{\arcsin x}{2} + \frac{1}{2} \sin(\arcsin x) \cdot \cos(\arcsin x) \\
 &= \frac{\arcsin x}{2} + \frac{1}{2} x \sqrt{1 - x^2}.
\end{aligned}
\]
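This primitive has a pleasant geometric check: between 0 and 1 it should give the area of a quarter of the unit disc. The following sketch compares it with a numerical integration on [0, 0.8] and evaluates it on [0, 1]; the helper `simpson` and the particular intervals are assumptions for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    # Primitive of sqrt(1 - x^2) obtained from the substitution x = sin u.
    return math.asin(x) / 2 + x * math.sqrt(1 - x * x) / 2

a, b = 0.0, 0.8
numeric = simpson(lambda x: math.sqrt(1 - x * x), a, b)
exact = F(b) - F(a)

# Area under the quarter circle, from the primitive alone.
quarter_disc = F(1.0) - F(0.0)
print(numeric, exact, quarter_disc, math.pi / 4)
```

The numerical integration stops at 0.8 because Simpson's rule converges slowly near x = 1, where the integrand has a vertical tangent.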


Substitution and integration by parts are the only fundamental methods 
which you have to learn; with their aid primitives can be found for 
a large number of functions. Nevertheless, as some of our examples reveal, 
success often depends upon some additional tricks. The most important are 
listed below. Using these you should be able to integrate all the functions in 
Problems 1 to 7 (a few other interesting tricks are explained in some of the 
remaining problems). 


1. TRIGONOMETRIC FUNCTIONS 


Since

\[ \sin^2 x + \cos^2 x = 1 \]

and

\[ \cos 2x = \cos^2 x - \sin^2 x, \]

we obtain

\[ \cos 2x = \cos^2 x - (1 - \cos^2 x) = 2 \cos^2 x - 1, \]
\[ \cos 2x = (1 - \sin^2 x) - \sin^2 x = 1 - 2 \sin^2 x, \]

or

\[ \sin^2 x = \frac{1 - \cos 2x}{2}, \]
\[ \cos^2 x = \frac{1 + \cos 2x}{2}. \]

These formulas may be used to integrate

\[ \int \sin^n x \, dx, \qquad \int \cos^n x \, dx, \]

if n is even. Substituting

\[ \frac{1 - \cos 2x}{2} \quad \text{or} \quad \frac{1 + \cos 2x}{2} \]

for sin² x or cos² x yields a sum of terms involving lower powers of cos. For
example,

\[ \int \sin^4 x \, dx = \int \Big( \frac{1 - \cos 2x}{2} \Big)^2 dx = \int \frac{1}{4} \, dx - \frac{1}{2} \int \cos 2x \, dx + \frac{1}{4} \int \cos^2 2x \, dx \]

and

\[ \int \cos^2 2x \, dx = \int \frac{1 + \cos 4x}{2} \, dx. \]


If n is odd, n = 2k + 1, then

\[ \int \sin^n x \, dx = \int \sin x \, (1 - \cos^2 x)^k \, dx; \]

the latter expression, multiplied out, involves terms of the form $\sin x \cos^l x$,
all of which can be integrated easily. The integral for $\cos^n x$ is treated similarly.
An integral

\[ \int \sin^n x \cos^m x \, dx \]

is handled the same way if n or m is odd. If n and m are both even, use the
formulas for sin² x and cos² x.
A final important trigonometric integral is

\[ \int \frac{1}{\cos x} \, dx = \int \sec x \, dx = \log(\sec x + \tan x). \]

Although there are several ways of "deriving" this result, by means of the
methods already at our disposal (Problem 10), it is simplest to check this
formula by differentiating the right side, and to memorize it.


2. REDUCTION FORMULAS 
Integration by parts yields (Problem 17) 


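\[ \int \sin^n x \, dx = -\frac{1}{n} \sin^{n-1} x \cos x + \frac{n - 1}{n} \int \sin^{n-2} x \, dx, \]

\[ \int \cos^n x \, dx = \frac{1}{n} \cos^{n-1} x \sin x + \frac{n - 1}{n} \int \cos^{n-2} x \, dx, \]

\[ \int \frac{1}{(x^2 + 1)^n} \, dx = \frac{1}{2n - 2} \cdot \frac{x}{(x^2 + 1)^{n-1}} + \frac{2n - 3}{2n - 2} \int \frac{1}{(x^2 + 1)^{n-1}} \, dx, \]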

and many similar formulas. The first two, used repeatedly, give a different 
method for evaluating primitives of sin” or cos”. The third is very important 
for integrating a large general class of functions, which will complete our 
discussion. 
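Used repeatedly, the first reduction formula is a recursive recipe, and it can be transcribed almost literally. The sketch below is one possible transcription, not a definitive implementation: the function name `sin_power_primitive`, the base cases chosen for n = 0 and n = 1, and the helper `simpson` used for comparison are all assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def sin_power_primitive(n, x):
    """Value at x of a primitive of sin^n, built from the reduction formula."""
    if n == 0:
        return x                 # a primitive of the constant function 1
    if n == 1:
        return -math.cos(x)      # a primitive of sin
    # -(1/n) sin^(n-1) x cos x + ((n-1)/n) * (primitive of sin^(n-2))
    return (-math.sin(x) ** (n - 1) * math.cos(x) / n
            + (n - 1) / n * sin_power_primitive(n - 2, x))

a, b = 0.2, 1.7
numeric = simpson(lambda x: math.sin(x) ** 4, a, b)
exact = sin_power_primitive(4, b) - sin_power_primitive(4, a)
print(numeric, exact)  # the two values should agree closely
```

Each call lowers the exponent by 2, so the recursion ends at n = 0 for even n and at n = 1 for odd n.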


3. RATIONAL FUNCTIONS 


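Consider a rational function p/q where

\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0, \]
\[ q(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0. \]

We might as well assume that $a_n = b_m = 1$. Moreover, we can assume that
n < m, for otherwise we may express p/q as a polynomial function plus a
rational function which is of this form by dividing (for example:

\[ \frac{u^2}{u - 1} = u + 1 + \frac{1}{u - 1}.) \]

The integration of an arbitrary rational function depends on two facts; the
first follows from the "Fundamental Theorem of Algebra" (see Chapter 25,
Theorem 2 and Problem 25-3), but the second will not be proved in this book.


THEOREM

Every polynomial function

\[ q(x) = x^m + b_{m-1} x^{m-1} + \cdots + b_0 \]

can be written as a product

\[ q(x) = (x - \alpha_1)^{r_1} \cdots (x - \alpha_k)^{r_k} (x^2 + \beta_1 x + \gamma_1)^{s_1} \cdots (x^2 + \beta_l x + \gamma_l)^{s_l} \]

(where $r_1 + \cdots + r_k + 2(s_1 + \cdots + s_l) = m$).

(In this expression, identical factors have been collected together, so that all
$x - \alpha_i$ and $x^2 + \beta_i x + \gamma_i$ may be assumed distinct. Moreover, we assume
that each quadratic factor cannot be factored further. This means that

\[ \beta_i^2 - 4\gamma_i < 0, \]

since otherwise we can factor

\[ x^2 + \beta_i x + \gamma_i = \Big[ x - \frac{-\beta_i + \sqrt{\beta_i^2 - 4\gamma_i}}{2} \Big] \Big[ x - \frac{-\beta_i - \sqrt{\beta_i^2 - 4\gamma_i}}{2} \Big] \]

into linear factors.)


THEOREM

If n < m and

\[ p(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_0, \]
\[
\begin{aligned}
q(x) &= x^m + b_{m-1} x^{m-1} + \cdots + b_0 \\
     &= (x - \alpha_1)^{r_1} \cdots (x - \alpha_k)^{r_k} (x^2 + \beta_1 x + \gamma_1)^{s_1} \cdots (x^2 + \beta_l x + \gamma_l)^{s_l},
\end{aligned}
\]

then p(x)/q(x) can be written in the form

\[
\begin{aligned}
\frac{p(x)}{q(x)} ={}& \Big[ \frac{a_{1,1}}{(x - \alpha_1)} + \cdots + \frac{a_{1,r_1}}{(x - \alpha_1)^{r_1}} \Big] + \cdots \\
 &+ \Big[ \frac{a_{k,1}}{(x - \alpha_k)} + \cdots + \frac{a_{k,r_k}}{(x - \alpha_k)^{r_k}} \Big] \\
 &+ \Big[ \frac{b_{1,1} x + c_{1,1}}{(x^2 + \beta_1 x + \gamma_1)} + \cdots + \frac{b_{1,s_1} x + c_{1,s_1}}{(x^2 + \beta_1 x + \gamma_1)^{s_1}} \Big] + \cdots \\
 &+ \Big[ \frac{b_{l,1} x + c_{l,1}}{(x^2 + \beta_l x + \gamma_l)} + \cdots + \frac{b_{l,s_l} x + c_{l,s_l}}{(x^2 + \beta_l x + \gamma_l)^{s_l}} \Big].
\end{aligned}
\]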


This expression, known as the "partial fraction decomposition" of p(x)/q(x),
is so complicated that it is simpler to examine the following example, which
illustrates such an expression and shows how to find it. According to the
theorem, it is possible to write

\[
\frac{2x^7 + 8x^6 + 13x^5 + 20x^4 + 17x^3 + 16x^2 + 7x + 7}{(x^2 + x + 1)^2 (x^2 + 2x + 2)(x - 1)^2}
= \frac{a}{x - 1} + \frac{b}{(x - 1)^2} + \frac{cx + d}{x^2 + 2x + 2} + \frac{ex + f}{x^2 + x + 1} + \frac{gx + h}{(x^2 + x + 1)^2}.
\]

To find the numbers a, b, c, d, e, f, g, and h, write the right side as a polynomial
over the common denominator $(x^2 + x + 1)^2 (x^2 + 2x + 2)(x - 1)^2$; the
numerator becomes

\[
\begin{aligned}
& a(x - 1)(x^2 + 2x + 2)(x^2 + x + 1)^2 + b(x^2 + 2x + 2)(x^2 + x + 1)^2 \\
& \quad + (cx + d)(x - 1)^2 (x^2 + x + 1)^2 + (ex + f)(x - 1)^2 (x^2 + 2x + 2)(x^2 + x + 1) \\
& \quad + (gx + h)(x - 1)^2 (x^2 + 2x + 2).
\end{aligned}
\]

Actually multiplying this out (!) we obtain a polynomial of degree 7, whose
coefficients are combinations of a, . . . , h. Equating these coefficients with
the coefficients of $2x^7 + 8x^6 + 13x^5 + 20x^4 + 17x^3 + 16x^2 + 7x + 7$
we obtain 8 equations in the eight unknowns a, . . . , h.
After heroic calculations these can be solved to give

        a = 1,  b = 2,  c = 1,  d = 3,
        e = 0,  f = 0,  g = 0,  h = 1.

Thus

\[
\begin{aligned}
\int \frac{2x^7 + 8x^6 + 13x^5 + 20x^4 + 17x^3 + 16x^2 + 7x + 7}{(x^2 + x + 1)^2 (x^2 + 2x + 2)(x - 1)^2} \, dx
 ={}& \int \frac{1}{x - 1} \, dx + \int \frac{2}{(x - 1)^2} \, dx \\
 &+ \int \frac{1}{(x^2 + x + 1)^2} \, dx + \int \frac{x + 3}{x^2 + 2x + 2} \, dx.
\end{aligned}
\]


(In simpler cases the requisite calculations may actually be feasible. I obtained 
this particular example by starting with the partial fraction decomposition and 
converting it into one fraction.) 
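Converting a decomposition into one fraction is mechanical enough to hand to a machine. The sketch below recombines the four fractions with coefficients a = 1, b = 2, c = 1, d = 3, h = 1 (and e = f = g = 0) over the common denominator, using exact integer arithmetic. The list representation of polynomials and the helper names `poly_mul` and `poly_add` are assumptions made for illustration.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists, constant term first."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

A = [1, 1, 1]    # x^2 + x + 1
B = [2, 2, 1]    # x^2 + 2x + 2
C = [-1, 1]      # x - 1
A2 = poly_mul(A, A)
C2 = poly_mul(C, C)

# Numerator contributed by each fraction over the common denominator A^2 * B * C^2.
terms = [
    poly_mul(poly_mul(C, B), A2),        # from 1/(x - 1)
    poly_mul([2], poly_mul(B, A2)),      # from 2/(x - 1)^2
    poly_mul([3, 1], poly_mul(C2, A2)),  # from (x + 3)/(x^2 + 2x + 2)
    poly_mul(C2, B),                     # from 1/(x^2 + x + 1)^2
]

numerator = [0]
for t in terms:
    numerator = poly_add(numerator, t)

print(numerator[::-1])  # coefficients from x^7 down to the constant term
```

Going the other way, from the single fraction to the coefficients, means solving the eight linear equations described in the text.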

We are already in a position to find each of the integrals appearing in the 
above expression; the calculations will illustrate all the difficulties which arise 
in integrating rational functions. 

The first two integrals are simple: 


\[ \int \frac{1}{x - 1} \, dx = \log(x - 1), \]

\[ \int \frac{2}{(x - 1)^2} \, dx = \frac{-2}{x - 1}. \]

The third integration depends on "completing the square":

\[ x^2 + x + 1 = \Big( x + \frac{1}{2} \Big)^2 + \frac{3}{4} = \frac{3}{4} \Big[ \Big( \frac{x + \frac{1}{2}}{\sqrt{3/4}} \Big)^2 + 1 \Big]. \]

(If we had obtained -3/4 instead of 3/4 we could not take the square root, but in
this case our original quadratic factor could have been factored into linear
factors.) We can now write

\[ \int \frac{1}{(x^2 + x + 1)^2} \, dx = \frac{16}{9} \int \frac{1}{\Big[ \Big( \frac{x + \frac{1}{2}}{\sqrt{3/4}} \Big)^2 + 1 \Big]^2} \, dx. \]

The substitution

\[ u = \frac{x + \frac{1}{2}}{\sqrt{3/4}}, \qquad du = \frac{1}{\sqrt{3/4}} \, dx \]

changes this integral to

\[ \frac{16}{9} \sqrt{\frac{3}{4}} \int \frac{1}{(u^2 + 1)^2} \, du, \]

which can be computed using the third reduction formula given above.
Finally, to evaluate

\[ \int \frac{x + 3}{x^2 + 2x + 2} \, dx \]

we write

\[ \int \frac{x + 3}{x^2 + 2x + 2} \, dx = \frac{1}{2} \int \frac{2x + 2}{x^2 + 2x + 2} \, dx + \int \frac{2}{(x + 1)^2 + 1} \, dx. \]

The first integral on the right side has been purposely constructed so that we
can evaluate it by using the substitution

        u = x² + 2x + 2,
        du = (2x + 2) dx.

The second integral on the right, which is just the difference of the other two,
is simply 2 arctan (x + 1). If the original integral were

\[ \int \frac{x + 3}{(x^2 + 2x + 2)^n} \, dx = \frac{1}{2} \int \frac{2x + 2}{(x^2 + 2x + 2)^n} \, dx + \int \frac{2}{((x + 1)^2 + 1)^n} \, dx, \]


the first integral on the right would still be evaluated by the same substitution. 
The second integral would be evaluated by means of a reduction formula. 

This example has probably convinced you that integration of rational functions
is a theoretical curiosity only, especially since it is necessary to find the
factorization of q(x) before you can even begin. This is only partly true. We
have already seen that simple rational functions sometimes arise, as in the
integration

\[ \int \frac{1 + e^x}{1 - e^x} \, dx; \]

another important example is the integral

\[ \int \frac{1}{x^2 - 1} \, dx = \frac{1}{2} \int \Big[ \frac{1}{x - 1} - \frac{1}{x + 1} \Big] dx = \frac{1}{2} \log(x - 1) - \frac{1}{2} \log(x + 1). \]


Moreover, if a problem has been reduced to the integration of a rational func- 
tion, it is then certain that an elementary primitive exists, even when the diffi- 
culty or impossibility of finding the factors of the denominator may preclude 
writing this primitive explicitly. 


PROBLEMS 


1. This problem contains some integrals which require little more than 
algebraic manipulation, and consequently test your ability to discover 
algebraic tricks, rather than your understanding of the integration 
processes. Nevertheless, any one of these tricks might be an important 
preliminary step in an honest integration problem. Moreover, you want 
to have some feel for which integrals are easy, so that you can see when 
the end of an integration process is in sight. The answer section, if you 
resort to it, will only reveal what algebra you should have used. 


| Vat + Vx 

(i) a Νὰ 

i) lz tVe41 
(iii) {oss ai a dx. 


(iv) ir — dx. 


(v) $\int \tan^2 x \, dx$. (Trigonometric integrals are always very touchy,
because there are so many trigonometric identities 
that an easy problem can easily look hard.) 


wi) ver : x? 
(vii) = a= eo a 
wi) [as 


(ix) [= + 6x + 4 


dx. 
at 


ΒΞ ἘΞ 
| V 2x — x? 




2. The following integrations involve simple substitutions, most of which 
you should be able to do in your head. 


(i) | e” sin é” dx. 
(ii) | χε τ dx, 
(iii) $\int \frac{\log x}{x} \, dx$. (In the text this was done by parts.)
e” dx 
wy Paces mae 1 
(v) [ e*e* dx, 
x dx 
wi) [y= V1 — x4 1 — x# 
(vii) [Ξ aN, 
(viii) $\int x \sqrt{1 - x^2} \, dx$.
(ix) $\int \log(\cos x) \tan x \, dx$.
(x) $\int \frac{\log(\log x)}{x \log x} \, dx$.


3. Integration by parts. 


[reas 

(ii) / x8e"" dx. 

(iii) [ 665 sin bx dx. 
iv) [stained 
(v) / (log x)? dx. 
(vi) [ 808 ay 


(vii) $\int \sec^3 x \, dx$. (This is a tricky and important integral that often
comes up. If you do not succeed in evaluating it, be 
sure to consult the answers.) 


(viii) $\int \cos(\log x) \, dx$.

(ix) $\int \sqrt{x} \log x \, dx$.
(x) $\int x (\log x)^2 \, dx$.


4. The following integrations can all be done with substitutions of the form
x = sin u, x = cos u, etc. To do some of these you will need to remember
that

\[ \int \sec x \, dx = \log(\sec x + \tan x) \]

as well as the following formula, which can also be checked by differentiation:

\[ \int \csc x \, dx = -\log(\csc x + \cot x). \]

In addition, at this point the derivatives of all the trigonometric functions
should be kept handy.


(i) $\int \frac{dx}{\sqrt{1 - x^2}}$. (You already know this integral, but use the substitution
x = sin u anyway, just to see how it works out.)

(ii) $\int \frac{dx}{\sqrt{1 + x^2}}$. (Since tan² u + 1 = sec² u, you want to use the
substitution x = tan u.)

(iii) $\int \frac{dx}{\sqrt{x^2 - 1}}$.

(iv) $\int \frac{dx}{x \sqrt{x^2 - 1}}$. (The answer will be a certain inverse function that
was given short shrift in the text.)


wo | w= 
w [as 


(vii) $\int x^3 \sqrt{1 - x^2} \, dx$. (You will need to remember the methods for
integrating powers of sin and cos.)
(viii) $\int \sqrt{1 - x^2} \, dx$.

(ix) $\int \sqrt{1 + x^2} \, dx$.
(x) $\int \sqrt{x^2 - 1} \, dx$.


5. The following integrations involve substitutions of various types. There
is no substitute for cleverness, but there is a general rule to follow: substitute
for an expression which appears frequently or prominently; if two
different troublesome expressions appear, try to express them both in
terms of some new expression. And don't forget that it usually helps to
express x directly in terms of u, to find out the proper expression to substitute
for dx.


; dx 

0) | 1+ ‘/ x + 1 
i dx 

a) τ 

πρὶ dx 

w | yz 


= φῇ leads to an integral requir- 
ing yet another substitution; this is all right, but 
both substitutions can be done at once.) 


εἶχ 
v) | 2+ tan x 


(vi) | I (Another place where one substitution can be 
Vx +1 made to do the work of two.) 


(vii) | εἰδεαε dx. 
2* + 1 


(viii) | εν: dx. 
(ix) ips 7 ἀν. (In this case two successive substitutions work out 

Vx best; there are two obvious candidates for the first 
substitution, and either will work.) 


“@) [Nip he a 


6. The previous problem provided gratis a haphazard selection of rational
functions to be integrated. Here is a more systematic selection.


(i) | 2x° ΞΕ inl He 


Ree = τ. 
a) 2x + 1 
() Ι: 3 — 3x7 + 3x — 1 = 
(iii) [SSS + 7x? —5x-+5 
Gs Teal)? 


2x7 +x +1 
ἫΝ τι ἘΠΕ ΤῸ ἐν 


x+4 
(v) [ΞΞΞὰ 


dx. 




etxr+2 
a e+e i” 


wii) | Janae ae 
goa ey ἢ 


oo dx 
(viii) J rae 
; 2K 
(ix) ΓΞ ΤΣ tet)? dx 


jaan 


*7. Potpourri. (No holds barred.) The following integrations involve all the 
methods of the previous problems. 


(i) [ arctan x ἢ 
1+ x? 


(ii) / x arctan x 
(1 + x?)8 


(iii) " log V1 + x? dx. 


(iv) [ x log ΝΊ - x? dx. 
2 ea oe oe 
le 1 1: Vie 
(vi) | arcsin Vx dx. 
(vii) [ τ τ dx 
(viii) . | τ ρα ES Fe 


cos? x 


(ix) [ V tan x dx. 


(x) / ai ; (To factor «6 + 1, first factor y? + 1, using Problem 
1-1.) 


8. If you have done Problem 17-7, the integrals (ii) and (iii) in Problem 4 will
look very familiar. In general, the substitution x = cosh u often works
for integrals involving $\sqrt{x^2 - 1}$, while x = sinh u is the thing to try for
integrals involving $\sqrt{x^2 + 1}$. Try these substitutions on the other integrals
in Problem 4. (The method is not really recommended; it is easier
to stick with trigonometric substitutions.)

9. The world's sneakiest substitution is undoubtedly

\[ t = \tan \frac{x}{2}, \qquad x = 2 \arctan t, \]
\[ dx = \frac{2}{1 + t^2} \, dt. \]

As we found in Problem 15-16, this substitution leads to the expressions

\[ \sin x = \frac{2t}{1 + t^2}, \qquad \cos x = \frac{1 - t^2}{1 + t^2}. \]

This substitution thus transforms any integral which involves only sin
and cos, combined by addition, multiplication, and division, into the
integral of a rational function. Find

(i) $\int \frac{dx}{1 + \sin x}$. (Compare your answer with Problem 1 (viii).)

(ii) $\int \frac{dx}{1 - \sin^2 x}$. (In this case it is better to let t = tan x. Why?)

(iii) $\int \frac{dx}{a \sin x + b \cos x}$. (There is also another way to do this, using
Problem 15-7.)

(iv) $\int \sin^2 x \, dx$. (An exercise to convince you that this substitution
should be used only as a last resort.)

(v) $\int \frac{dx}{3 + 5 \sin x}$. (A last resort.)

*10. Derive the formula for $\int \sec x \, dx$ in the following two ways:
(a) By writing

\[ \frac{1}{\cos x} = \frac{\cos x}{\cos^2 x} = \frac{\cos x}{1 - \sin^2 x} = \frac{1}{2} \Big[ \frac{\cos x}{1 + \sin x} + \frac{\cos x}{1 - \sin x} \Big], \]

an expression obviously inspired by partial fraction decompositions.
Be sure to note that $\int \cos x/(1 - \sin x) \, dx = -\log(1 - \sin x)$; the
minus sign is very important. And remember that $\frac{1}{2} \log \alpha = \log \sqrt{\alpha}$.
From there on, keep doing algebra, and trust to luck.

(b) By using the substitution t = tan x/2. Once again, quite a bit of
manipulation is required to put the answer in the desired form; the
expression tan x/2 can be attacked by using Problem 15-8, or both
answers can be expressed in terms of t. There is another expression
for $\int \sec x \, dx$, which is less cumbersome than log(sec x + tan x);
using Problem 15-8, we obtain

\[ \int \sec x \, dx = \log \Big( \frac{1 + \tan \frac{x}{2}}{1 - \tan \frac{x}{2}} \Big) = \log \Big( \tan \Big( \frac{x}{2} + \frac{\pi}{4} \Big) \Big). \]

This last expression was actually the one first discovered, and was
due, not to any mathematician's cleverness, but to a curious historical
accident: In 1599 Wright computed nautical tables that
amounted to definite integrals of sec. When the first tables for the
logarithms of tangents were produced, the correspondence between
the two tables was immediately noticed (but remained unexplained
until the invention of calculus).

11. The derivation of $\int e^x \sin x \, dx$ given in the text seems to prove that the
only primitive of $f(x) = e^x \sin x$ is $F(x) = e^x(\sin x - \cos x)/2$, whereas
$F(x) = e^x(\sin x - \cos x)/2 + C$ is also a primitive for any number C.
Where does C come from? (What is the meaning of the equation

\[ \int e^x \sin x \, dx = e^x \sin x - e^x \cos x - \int e^x \sin x \, dx?) \]

12. (a) Find $\int \arcsin x \, dx$, using the same trick that worked for log and
arctan.
*(b) Generalize this trick: Find $\int f^{-1}(x) \, dx$ in terms of $\int f(x) \, dx$. Compare
with Problems 12-15 and 14-10.

13. (a) Find $\int \sin^4 x \, dx$ in two different ways: first using the reduction
formula, and then using the formula for sin² x.
(b) Combine your answers to obtain an impressive trigonometric
identity.

14. Express $\int \log(\log x) \, dx$ in terms of $\int (\log x)^{-1} \, dx$. (Neither is expressible
in terms of elementary functions.)

15. Express $\int x^2 e^{-x^2} \, dx$ in terms of $\int e^{-x^2} \, dx$.

16. Prove that the function $f(x) = e^x/(e^{5x} + e^x + 1)$ has an elementary
primitive. (Do not try to find it!)

17. Prove the reduction formulas in the text. For the third one write

\[ \int \frac{dx}{(1 + x^2)^n} = \int \frac{dx}{(1 + x^2)^{n-1}} - \int \frac{x^2 \, dx}{(1 + x^2)^n} \]

and work on the last integral. (Another possibility is to use the substitution
x = tan u.)


18. Find a reduction formula for
(a) $\int x^n e^x \, dx$.
(b) $\int (\log x)^n \, dx$.

*19. Prove that

\[ \int_1^{\cosh x} \sqrt{t^2 - 1} \, dt = \frac{\cosh x \sinh x}{2} - \frac{x}{2}. \]

(See Problem 17-4 for the significance of this computation.)

20. Prove that

\[ \int_a^b f(x) \, dx = \int_a^b f(a + b - x) \, dx. \]

(A geometric interpretation makes this clear, but it is also a good exercise
in the handling of limits of integration during a substitution.)

21. Prove that the area of a circle of radius r is πr². (Naturally you must
remember that π is defined as the area of the unit circle.)

22. Use induction and integration by parts to generalize Problem 14-0:

\[ \int_0^x \frac{f(u)(x - u)^n}{n!} \, du = \int_0^x \Big( \int_0^{u_n} \Big( \cdots \Big( \int_0^{u_1} f(t) \, dt \Big) du_1 \Big) \cdots \Big) du_n. \]

23. If f′ is continuous on [a, b], use integration by parts to prove the Riemann-Lebesgue
Lemma for f:

\[ \lim_{\lambda \to \infty} \int_a^b f(t) \sin(\lambda t) \, dt = 0. \]

This result is just a special case of Problem 15-25, but it can be used to
prove the general case (in much the same way that the Riemann-Lebesgue
Lemma was derived in Problem 15-25 from the special case in
which f is a step function).

24. Prove the following version of integration by parts for improper integrals:

\[ \int_a^\infty u'(x) v(x) \, dx = u(x) v(x) \Big|_a^\infty - \int_a^\infty u(x) v'(x) \, dx. \]

The first symbol on the right side means, of course,

\[ \lim_{x \to \infty} u(x) v(x) - u(a) v(a). \]

*25. One of the most important functions in analysis is the gamma function,

\[ \Gamma(x) = \int_0^\infty e^{-t} t^{x-1} \, dt. \]

(a) Prove that the improper integral Γ(x) is defined if x > 0.
(b) Use integration by parts (more precisely, the improper integral
version in the previous problem) to prove that

\[ \Gamma(x + 1) = x \Gamma(x). \]

(c) Show that Γ(1) = 1, and conclude that Γ(n) = (n - 1)! for all
natural numbers n.

The gamma function thus provides a simple example of a continuous
function which "interpolates" the values of n! for natural numbers n.
Of course there are infinitely many continuous functions f with
f(n) = (n - 1)!; there are even infinitely many continuous functions
f with f(x + 1) = xf(x) for all x > 0. However, the gamma
function has the important additional property that log ∘ Γ is
convex, a condition which expresses the extreme smoothness of
this function. A beautiful theorem due to Harald Bohr and Johannes
Mollerup states that Γ is the only function f with log ∘ f convex,
f(1) = 1 and f(x + 1) = xf(x). See the Suggested Reading for a
reference.
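The two properties of the gamma function established in parts (b) and (c) can be spot-checked numerically. The sketch below relies on `math.gamma` from Python's standard library, which is assumed here to compute the same function Γ; the sample points are arbitrary, and agreement at finitely many points is of course no substitute for the proofs the problem asks for.

```python
import math

# Gamma(n) compared with (n - 1)! for the first few natural numbers.
factorial_gaps = [abs(math.gamma(n) - math.factorial(n - 1)) for n in range(1, 10)]

# Gamma(x + 1) compared with x * Gamma(x) at a few non-integer points.
xs = [0.5, 1.3, 2.7, 4.2]
recurrence_gaps = [abs(math.gamma(x + 1) - x * math.gamma(x)) for x in xs]

print(max(factorial_gaps), max(recurrence_gaps))  # both should be very small
```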


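*26. (a) Use the reduction formula for $\int \sin^n x \, dx$ to show that

\[ \int_0^{\pi/2} \sin^n x \, dx = \frac{n - 1}{n} \int_0^{\pi/2} \sin^{n-2} x \, dx. \]

(b) Now show that

\[ \int_0^{\pi/2} \sin^{2n+1} x \, dx = \frac{2}{3} \cdot \frac{4}{5} \cdot \frac{6}{7} \cdots \frac{2n}{2n + 1}, \]
\[ \int_0^{\pi/2} \sin^{2n} x \, dx = \frac{\pi}{2} \cdot \frac{1}{2} \cdot \frac{3}{4} \cdot \frac{5}{6} \cdots \frac{2n - 1}{2n}, \]

and conclude that

\[ \frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdots \frac{2n}{2n - 1} \cdot \frac{2n}{2n + 1} \cdot \frac{\int_0^{\pi/2} \sin^{2n} x \, dx}{\int_0^{\pi/2} \sin^{2n+1} x \, dx}. \]

(c) Show that the quotient of the two integrals in this expression is
between 1 and 1 + 1/2n, starting with the inequalities

\[ 0 < \sin^{2n+1} x \le \sin^{2n} x \le \sin^{2n-1} x \quad \text{for } 0 < x < \pi/2. \]

This result, which shows that the products

\[ \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdots \frac{2n}{2n - 1} \cdot \frac{2n}{2n + 1} \]

can be made as close to π/2 as desired, is usually written as an
infinite product, known as Wallis' product:

\[ \frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdots. \]

(d) Show also that the products

\[ \frac{1}{\sqrt{n}} \cdot \frac{2 \cdot 4 \cdot 6 \cdots 2n}{1 \cdot 3 \cdot 5 \cdots (2n - 1)} \]

can be made as close to $\sqrt{\pi}$ as desired. (This fact is used in the next
problem and in Problem 26-18.)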
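It is instructive to watch how slowly these products approach their limits. The sketch below computes both of them for a given n; the function names `wallis` and `sqrt_pi_product` and the values of n are assumptions chosen for illustration.

```python
import math

def wallis(n):
    """Product (2/1)(2/3)(4/3)(4/5)...(2n/(2n-1))(2n/(2n+1))."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= (2 * k) / (2 * k - 1) * (2 * k) / (2 * k + 1)
    return prod

def sqrt_pi_product(n):
    """(1/sqrt(n)) * (2*4*...*2n) / (1*3*...*(2n-1))."""
    ratio = 1.0
    for k in range(1, n + 1):
        ratio *= (2 * k) / (2 * k - 1)
    return ratio / math.sqrt(n)

for n in (10, 100, 10000):
    # Distances from pi/2 and from sqrt(pi); both shrink as n grows.
    print(n, abs(wallis(n) - math.pi / 2), abs(sqrt_pi_product(n) - math.sqrt(math.pi)))
```

The factors are multiplied pairwise, rather than forming the numerator and denominator separately, so that the intermediate values stay of moderate size.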


Integration in Elementary Terms 329 


**97_ It is an astonishing fact that improper integrals i. f(x) dx can often be 
: δ. ; 
computed in cases where ordinary integrals Ι, f(x) dx cannot. There is 
b 
no elementary formula for [ e~* dx, but we can find the value of 


a e~*’ dx precisely! There are many ways of evaluating this integral, 


but most require some advanced techniques; the following method 
involves a fair amount of work, but no facts that you do not already 
know. 


(a) Show that 


ih (1 — x)% de = 2. : ΝΣ ons 


3 
[πέντ 
ο (1 + x7)” 2 


(This can be done using reduction formulas, or by appropriate 
substitutions, combined with the previous problem. ) 
(b) Prove, using the derivative, that 


ee 
4 


1— χ «ες τὶ [γ0 <x <1. 


ree 


for 0 < x. 
1+ x? 


(c) Integrate the nth powers of these inequalities from 0 to 1 and from 0 
to %, respectively. Then use the substitution y = V/nx to show that 


-2 4 2n 
3. 5 2n + 1 
«[ ον ἀν < Pa ν᾽ dy 
Jo Jo 
πος Zn -- 3 
~ 2 2 4 2n — 2 


(d) Now use Problem 26(c) to show that

$$\int_0^\infty e^{-y^2} dy = \frac{\sqrt{\pi}}{2}.$$
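The two bounds in part (c) can be evaluated directly to watch them squeeze $\sqrt{\pi}/2$. This Python sketch is illustrative only; `lower_bound` and `upper_bound` are hypothetical names for the left and right sides of the inequality in (c).

```python
import math

def lower_bound(n):
    """sqrt(n) * (2/3)(4/5)...(2n/(2n+1))."""
    value = math.sqrt(n)
    for k in range(1, n + 1):
        value *= (2 * k) / (2 * k + 1)
    return value

def upper_bound(n):
    """(pi/2) * sqrt(n) * (1/2)(3/4)...((2n-3)/(2n-2)); empty product when n = 1."""
    value = (math.pi / 2) * math.sqrt(n)
    for k in range(1, n):
        value *= (2 * k - 1) / (2 * k)
    return value

for n in (10, 100, 1000):
    print(n, lower_bound(n), upper_bound(n))
print("sqrt(pi)/2 =", math.sqrt(math.pi) / 2)
```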


**28. (a) Use integration by parts to show that

$$\int_a^b \frac{\sin x}{x} \, dx = \frac{\cos a}{a} - \frac{\cos b}{b} - \int_a^b \frac{\cos x}{x^2} \, dx,$$

and conclude that $\int_0^\infty (\sin x)/x \, dx$ exists. (Use the left side to investigate
the limit as $a \to 0^+$ and the right side for the limit as $b \to \infty$.)




(b) Use Problem 15-31 to show that

$$\int_0^\pi \frac{\sin\left(n + \frac{1}{2}\right)t}{\sin \frac{t}{2}} \, dt = \pi$$

for any natural number $n$.
(c) Prove that

$$\lim_{\lambda \to \infty} \int_0^\pi \sin\left(\lambda + \tfrac{1}{2}\right)t \left[ \frac{2}{t} - \frac{1}{\sin \frac{t}{2}} \right] dt = 0.$$

Hint: The term in brackets is bounded by Problem 15-2(vi); the
Riemann-Lebesgue Lemma then applies.
(d) Use the substitution $u = (\lambda + \frac{1}{2})t$ and part (b) to show that

$$\int_0^\infty \frac{\sin x}{x} \, dx = \frac{\pi}{2}.$$


*29. (a) Use the substitution $u = t^x$ to show that

$$\Gamma(x) = \frac{1}{x} \int_0^\infty e^{-u^{1/x}} \, du.$$

(b) Find $\Gamma(\frac{1}{2})$.


PART 4


INFINITE 
SEQUENCES 
AND 
INFINITE 
SERIES 


One of the most remarkable series of
algebraic analysis is the following:

$$1 + \frac{m}{1}x + \frac{m(m-1)}{1 \cdot 2}x^2 + \frac{m(m-1)(m-2)}{1 \cdot 2 \cdot 3}x^3 + \cdots + \frac{m(m-1) \cdots [m-(n-1)]}{1 \cdot 2 \cdots n}x^n + \cdots$$

When m is a positive whole number
the sum of the series,
which is then finite, can be expressed,
as is known, by $(1 + x)^m$.
When m is not an integer,
the series goes on to infinity, and it will
converge or diverge according
as the quantities
m and x have this or that value.
In this case, one writes the same equality

$$(1 + x)^m = 1 + \frac{m}{1}x + \frac{m(m-1)}{1 \cdot 2}x^2 + \cdots \text{ etc.}$$

... It is assumed that
the numerical equality will always occur
whenever the series is convergent, but
this has never yet been proved.


NIELS HENRIK ABEL 


CHAPTER 


APPROXIMATION BY 
POLYNOMIAL FUNCTIONS 


There is one sense in which the “elementary functions” are not elementary
at all. If $p$ is a polynomial function,

$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$

then $p(x)$ can be computed easily for any number $x$. This is not at all true for
functions like sin, log, or exp. At present, to find $\log x = \int_1^x 1/t \, dt$
approximately, we must compute some upper or lower sums, and make certain that
the error involved in accepting such a sum for $\log x$ is not too great.
Computing $e^x = \log^{-1}(x)$ would be even more difficult: we would have to
compute $\log a$ for many values of $a$ until we found a number $a$ such that
$\log a$ is approximately $x$; then $a$ would be approximately $e^x$.

In this chapter we will obtain important theoretical results which reduce 
the computation of f(x), for many functions f, to the evaluation of polynomial 
functions. The method depends on finding polynomial functions which are 
close approximations to f. In order to guess a polynomial which is appropriate, 
it is useful to first examine polynomial functions themselves more thoroughly. 


Suppose that

$$p(x) = a_0 + a_1 x + \cdots + a_n x^n.$$

It is interesting, and for our purposes very important, to note that the
coefficients $a_i$ can be expressed in terms of the value of $p$ and its various
derivatives at 0. To begin with, note that

$$p(0) = a_0.$$

Differentiating the original expression for $p(x)$ yields

$$p'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}.$$

Therefore,

$$p'(0) = p^{(1)}(0) = a_1.$$

Differentiating again we obtain

$$p''(x) = 2a_2 + 3 \cdot 2 \cdot a_3 x + \cdots + n(n-1) \cdot a_n x^{n-2}.$$

Therefore,

$$p''(0) = p^{(2)}(0) = 2a_2.$$

In general, we will have

$$p^{(k)}(0) = k! \, a_k, \quad \text{or} \quad a_k = \frac{p^{(k)}(0)}{k!}.$$

If we agree to define $0! = 1$, and recall the notation $p^{(0)} = p$, then this
formula holds for $k = 0$ also.




If we had begun with a function $p$ that was written as a “polynomial in
$(x - a)$,”

$$p(x) = a_0 + a_1(x - a) + \cdots + a_n(x - a)^n,$$

then a similar argument would show that

$$a_k = \frac{p^{(k)}(a)}{k!}.$$
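The relation $a_k = p^{(k)}(0)/k!$ is easy to check mechanically. The Python sketch below is an illustration under simple assumptions: a polynomial is stored as a list of integer coefficients, and the helper names and the sample polynomial are hypothetical.

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial given as [a_0, a_1, ..., a_n]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def kth_derivative_at_zero(coeffs, k):
    """Return p^(k)(0) for p(x) = sum of coeffs[i] * x**i."""
    for _ in range(k):
        coeffs = derivative(coeffs)
    return coeffs[0] if coeffs else 0

# Hypothetical example: p(x) = 5 - 3x + 2x^3 + 7x^4.
p = [5, -3, 0, 2, 7]

# With integer coefficients, p^(k)(0) = k! * a_k, so the division is exact.
recovered = [kth_derivative_at_zero(p, k) // factorial(k) for k in range(len(p))]
print(recovered)
```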


Suppose now that $f$ is a function (not necessarily a polynomial) such that

$$f^{(1)}(a), \ldots, f^{(n)}(a)$$

all exist. Let

$$a_k = \frac{f^{(k)}(a)}{k!}, \quad 0 \le k \le n,$$

and define

$$P_{n,a}(x) = a_0 + a_1(x - a) + \cdots + a_n(x - a)^n.$$

The polynomial $P_{n,a}$ is called the Taylor polynomial of degree $n$ for $f$ at $a$.
(Strictly speaking, we should use an even more complicated expression, like
$P_{n,a,f}$, to indicate the dependence on $f$; at times this more precise notation
will be useful.) The Taylor polynomial has been defined so that

$$P_{n,a}^{(k)}(a) = f^{(k)}(a) \quad \text{for } 0 \le k \le n;$$

in fact, it is clearly the only polynomial of degree $\le n$ with this property.

Although the coefficients of $P_{n,a,f}$ seem to depend upon $f$ in a fairly
complicated way, the most important elementary functions have extremely simple
Taylor polynomials. Consider first the function sin. We have

$$\sin(0) = 0,$$
$$\sin'(0) = \cos 0 = 1,$$
$$\sin''(0) = -\sin 0 = 0,$$
$$\sin'''(0) = -\cos 0 = -1,$$
$$\sin^{(4)}(0) = \sin 0 = 0.$$

From this point on, the derivatives repeat in a cycle of 4. The numbers

$$a_k = \frac{\sin^{(k)}(0)}{k!}$$

are

$$0, \ 1, \ 0, \ -\frac{1}{3!}, \ 0, \ \frac{1}{5!}, \ 0, \ -\frac{1}{7!}, \ 0, \ \frac{1}{9!}, \ \ldots$$

Therefore the Taylor polynomial $P_{2n+1,0}$ of degree $2n + 1$ for sin at 0 is

$$P_{2n+1,0}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!}.$$

(Of course, $P_{2n+1,0} = P_{2n+2,0}$.)




The Taylor polynomial $P_{2n,0}$ of degree $2n$ for cos at 0 is (the computations
are left to you)

$$P_{2n,0}(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + (-1)^n \frac{x^{2n}}{(2n)!}.$$

The Taylor polynomial for exp is especially easy to compute. Since
$\exp^{(k)}(0) = \exp(0) = 1$ for all $k$, the Taylor polynomial of degree $n$ at 0 is

$$P_{n,0}(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots + \frac{x^n}{n!}.$$
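As a quick illustration, the three Taylor polynomials just written down can be evaluated and compared with library values. This is a minimal Python sketch; the function names are hypothetical, and `math.sin`, `math.cos`, and `math.exp` serve only as reference values.

```python
import math

def taylor_sin(x, n):
    """P_{2n+1,0}(x) for sin: x - x^3/3! + ... + (-1)^n x^(2n+1)/(2n+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

def taylor_cos(x, n):
    """P_{2n,0}(x) for cos: 1 - x^2/2! + ... + (-1)^n x^(2n)/(2n)!."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(n + 1))

def taylor_exp(x, n):
    """P_{n,0}(x) for exp: 1 + x/1! + ... + x^n/n!."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

x = 0.5
print(taylor_sin(x, 5), math.sin(x))
print(taylor_cos(x, 5), math.cos(x))
print(taylor_exp(1.0, 10), math.exp(1.0))
```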


The Taylor polynomial for log must be computed at some point $a \ne 0$,
since log is not even defined at 0. The standard choice is $a = 1$. Then

$$\log'(x) = \frac{1}{x}, \quad \log'(1) = 1;$$
$$\log''(x) = -\frac{1}{x^2}, \quad \log''(1) = -1;$$
$$\log'''(x) = \frac{2}{x^3}, \quad \log'''(1) = 2;$$

in general

$$\log^{(k)}(x) = \frac{(-1)^{k-1}(k-1)!}{x^k}, \quad \log^{(k)}(1) = (-1)^{k-1}(k-1)!.$$

Therefore the Taylor polynomial of degree $n$ for log at 1 is

$$P_{n,1}(x) = (x - 1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \cdots + \frac{(-1)^{n-1}(x-1)^n}{n}.$$

It is often more convenient to consider the function $f(x) = \log(1 + x)$. In
this case we can choose $a = 0$. We have

$$f^{(k)}(x) = \log^{(k)}(1 + x),$$

so

$$f^{(k)}(0) = \log^{(k)}(1) = (-1)^{k-1}(k-1)!.$$

Therefore the Taylor polynomial of degree $n$ for $f$ at 0 is

$$P_{n,0}(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots + \frac{(-1)^{n-1}x^n}{n}.$$


There is one other elementary function whose Taylor polynomial is important:
arctan. The computations of the derivatives begin

$$\arctan'(x) = \frac{1}{1 + x^2}, \quad \arctan'(0) = 1;$$
$$\arctan''(x) = \frac{-2x}{(1 + x^2)^2}, \quad \arctan''(0) = 0;$$
$$\arctan'''(x) = \frac{(1 + x^2)^2 \cdot (-2) + 2x \cdot 2(1 + x^2) \cdot 2x}{(1 + x^2)^4}, \quad \arctan'''(0) = -2.$$




It is clear that this brute force computation will never do. However, the
Taylor polynomials of arctan will be easy to find after we have examined the
properties of Taylor polynomials more closely; although the Taylor polynomial
$P_{n,a,f}$ was simply defined so as to have the same first $n$ derivatives at
$a$ as $f$, the connection between $f$ and $P_{n,a,f}$ will actually turn out to be much
deeper.

One line of evidence for a closer connection between $f$ and the Taylor
polynomials for $f$ may be uncovered by examining the Taylor polynomial of
degree 1, which is

$$P_{1,a}(x) = f(a) + f'(a)(x - a).$$

Notice that

$$\frac{f(x) - P_{1,a}(x)}{x - a} = \frac{f(x) - f(a)}{x - a} - f'(a).$$

Now, by the definition of $f'(a)$ we have

$$\lim_{x \to a} \frac{f(x) - P_{1,a}(x)}{x - a} = 0.$$


FIGURE 1. Graphs of $f(x) = e^x$, $P_{1,0}(x) = 1 + x$, and $P_{2,0}(x) = 1 + x + x^2/2$.


In other words, as $x$ approaches $a$ the difference $f(x) - P_{1,a}(x)$ not only
becomes small, but actually becomes small even compared to $x - a$. Figure 1
illustrates the graph of $f(x) = e^x$ and of

$$P_{1,0}(x) = f(0) + f'(0)x = 1 + x,$$

which is the Taylor polynomial of degree 1 for $f$ at 0. The diagram also shows
the graph of

$$P_{2,0}(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 = 1 + x + \frac{x^2}{2},$$
which is the Taylor polynomial of degree 2 for $f$ at 0. As $x$ approaches 0, the
difference $f(x) - P_{2,0}(x)$ seems to be getting small even faster than the
difference $f(x) - P_{1,0}(x)$. As it stands, this assertion is not very precise, but
we are now prepared to give it a definite meaning. We have just noted that
in general

$$\lim_{x \to a} \frac{f(x) - P_{1,a}(x)}{x - a} = 0.$$

For $f(x) = e^x$ and $a = 0$ this means that

$$\lim_{x \to 0} \frac{f(x) - P_{1,0}(x)}{x} = \lim_{x \to 0} \frac{e^x - 1 - x}{x} = 0.$$

On the other hand, an easy double application of l'Hôpital's Rule shows
that

$$\lim_{x \to 0} \frac{e^x - 1 - x}{x^2} = \frac{1}{2} \ne 0.$$

Thus, although $f(x) - P_{1,0}(x)$ becomes small compared to $x$, as $x$ approaches
0, it does not become small compared to $x^2$. For $P_{2,0}(x)$ the situation is quite
different; the extra term $x^2/2$ provides just the right compensation:

$$\lim_{x \to 0} \frac{f(x) - P_{2,0}(x)}{x^2} = \lim_{x \to 0} \frac{e^x - 1 - x - \frac{x^2}{2}}{x^2} = \lim_{x \to 0} \frac{e^x - 1 - x}{2x} = \lim_{x \to 0} \frac{e^x - 1}{2} = 0.$$

This result holds in general: if $f'(a)$ and $f''(a)$ exist, then

$$\lim_{x \to a} \frac{f(x) - P_{2,a}(x)}{(x - a)^2} = 0;$$

in fact, the analogous assertion for $P_{n,a}$ is also true.


THEOREM 1

Suppose that $f$ is a function for which

$$f'(a), \ldots, f^{(n)}(a)$$

all exist. Let

$$a_k = \frac{f^{(k)}(a)}{k!}, \quad 0 \le k \le n,$$

and define

$$P_{n,a}(x) = a_0 + a_1(x - a) + \cdots + a_n(x - a)^n.$$

Then

$$\lim_{x \to a} \frac{f(x) - P_{n,a}(x)}{(x - a)^n} = 0.$$




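PROOF Writing out $P_{n,a}(x)$ explicitly, we obtain

$$\frac{f(x) - P_{n,a}(x)}{(x - a)^n} = \frac{f(x) - \sum_{i=0}^{n-1} \frac{f^{(i)}(a)}{i!}(x - a)^i}{(x - a)^n} - \frac{f^{(n)}(a)}{n!}.$$

It will help to introduce the new functions

$$Q(x) = \sum_{i=0}^{n-1} \frac{f^{(i)}(a)}{i!}(x - a)^i \quad \text{and} \quad g(x) = (x - a)^n;$$

now we must prove that

$$\lim_{x \to a} \frac{f(x) - Q(x)}{g(x)} = \frac{f^{(n)}(a)}{n!}.$$

Notice that

$$Q^{(k)}(a) = f^{(k)}(a), \quad k \le n - 1,$$
$$g^{(k)}(x) = n!(x - a)^{n-k}/(n - k)!.$$

Thus

$$\lim_{x \to a} [f(x) - Q(x)] = f(a) - Q(a) = 0,$$
$$\lim_{x \to a} [f'(x) - Q'(x)] = f'(a) - Q'(a) = 0,$$
$$\vdots$$
$$\lim_{x \to a} [f^{(n-2)}(x) - Q^{(n-2)}(x)] = f^{(n-2)}(a) - Q^{(n-2)}(a) = 0,$$

and

$$\lim_{x \to a} g(x) = \lim_{x \to a} g'(x) = \cdots = \lim_{x \to a} g^{(n-2)}(x) = 0.$$

We may therefore apply l'Hôpital's Rule $n - 1$ times to obtain

$$\lim_{x \to a} \frac{f(x) - Q(x)}{(x - a)^n} = \lim_{x \to a} \frac{f^{(n-1)}(x) - Q^{(n-1)}(x)}{n!(x - a)}.$$

Since $Q$ is a polynomial of degree $n - 1$, its $(n - 1)$st derivative is a constant;
in fact, $Q^{(n-1)}(x) = f^{(n-1)}(a)$. Thus

$$\lim_{x \to a} \frac{f(x) - Q(x)}{(x - a)^n} = \lim_{x \to a} \frac{f^{(n-1)}(x) - f^{(n-1)}(a)}{n!(x - a)},$$

and this last limit is $f^{(n)}(a)/n!$ by definition of $f^{(n)}(a)$. ∎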
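The three limits computed for $f(x) = e^x$ before Theorem 1 can be observed numerically. The following Python sketch is illustrative only; `ratios` is a hypothetical helper, and floating-point cancellation limits how small $x$ can usefully be taken.

```python
import math

def ratios(x):
    """Quotients (f - P1)/x, (f - P1)/x^2, (f - P2)/x^2 for f = exp at a = 0."""
    f = math.exp(x)
    p1 = 1 + x
    p2 = 1 + x + x * x / 2
    return (f - p1) / x, (f - p1) / x ** 2, (f - p2) / x ** 2

for x in (0.1, 0.01, 0.001):
    first, second, third = ratios(x)
    # first and third shrink toward 0; second approaches 1/2.
    print(x, first, second, third)
```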


One simple consequence of Theorem 1 allows us to perfect the test for local
maxima and minima which was developed in Chapter 11. If $a$ is a critical
point of $f$, then, according to Theorem 11-5, the function $f$ has a local
minimum at $a$ if $f''(a) > 0$, and a local maximum at $a$ if $f''(a) < 0$. If
$f''(a) = 0$ no conclusion was possible, but it is conceivable that the sign of
$f'''(a)$ might give further information; and if $f'''(a) = 0$, then the sign of
$f^{(4)}(a)$ might be significant. Even more generally, we can ask what happens when

$$(*) \qquad f'(a) = f''(a) = \cdots = f^{(n-1)}(a) = 0, \qquad f^{(n)}(a) \ne 0.$$

The situation in this case can be guessed by examining the functions

$$f(x) = (x - a)^n,$$
$$g(x) = -(x - a)^n,$$

which satisfy (*). Notice (Figure 2) that if $n$ is odd, then $a$ is neither a local
maximum nor a local minimum point for $f$ or $g$. On the other hand, if $n$ is
even, then $f$, with a positive $n$th derivative, has a local minimum at $a$, while $g$,
with a negative $n$th derivative, has a local maximum at $a$. Of all functions
satisfying (*), these are about the simplest available; nevertheless they indicate
the general situation exactly. In fact, the whole point of the next proof is that
any function satisfying (*) looks very much like one of these functions, in a
sense that is made precise by Theorem 1.

FIGURE 2. Graphs of $f(x) = (x - a)^n$ and $g(x) = -(x - a)^n$; panel (b): $n$ even.


THEOREM 2

Suppose that

$$f'(a) = \cdots = f^{(n-1)}(a) = 0,$$
$$f^{(n)}(a) \ne 0.$$

(1) If $n$ is even and $f^{(n)}(a) > 0$, then $f$ has a local minimum at $a$.
(2) If $n$ is even and $f^{(n)}(a) < 0$, then $f$ has a local maximum at $a$.
(3) If $n$ is odd, then $f$ has neither a local maximum nor a local minimum at $a$.

PROOF


There is clearly no loss of generality in assuming that $f(a) = 0$, since neither
the hypotheses nor the conclusion are affected if $f$ is replaced by $f - f(a)$.
Then, since the first $n - 1$ derivatives of $f$ at $a$ are 0, the Taylor polynomial
$P_{n,a}$ of $f$ is

$$P_{n,a}(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n = \frac{f^{(n)}(a)}{n!}(x - a)^n.$$

Thus, Theorem 1 states that

$$0 = \lim_{x \to a} \frac{f(x) - P_{n,a}(x)}{(x - a)^n} = \lim_{x \to a} \left[ \frac{f(x)}{(x - a)^n} - \frac{f^{(n)}(a)}{n!} \right].$$

Consequently, if $x$ is sufficiently close to $a$, then

$$\frac{f(x)}{(x - a)^n} \quad \text{has the same sign as} \quad \frac{f^{(n)}(a)}{n!}.$$

Suppose now that $n$ is even. In this case $(x - a)^n > 0$ for all $x \ne a$. Since
$f(x)/(x - a)^n$ has the same sign as $f^{(n)}(a)/n!$ for $x$ sufficiently close to $a$, it
follows that $f(x)$ itself has the same sign as $f^{(n)}(a)/n!$ for $x$ sufficiently close to
$a$. If $f^{(n)}(a) > 0$, this means that

$$f(x) > 0 = f(a)$$

for $x$ close to $a$. Consequently, $f$ has a local minimum at $a$. A similar proof
works for the case $f^{(n)}(a) < 0$.

Now suppose that $n$ is odd. The same argument as before shows that if $x$ is
sufficiently close to $a$, then

$$\frac{f(x)}{(x - a)^n} \quad \text{always has the same sign.}$$

But $(x - a)^n > 0$ for $x > a$ and $(x - a)^n < 0$ for $x < a$. Therefore $f(x)$ has
different signs for $x > a$ and $x < a$. This proves that $f$ has neither a local
maximum nor a local minimum at $a$. ∎
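The case analysis of Theorem 2 can be phrased as a small decision procedure. The Python sketch below is only an illustration: it assumes the values $f'(a), f''(a), \ldots$ have already been computed by hand and are passed in as a list, and the function name is hypothetical.

```python
def classify_by_theorem_2(derivs):
    """Classify the point a, given derivs[k-1] = f^(k)(a) for k = 1, 2, ...

    Finds the first nonvanishing derivative, of order n, and applies
    Theorem 2: n even gives a local minimum or maximum according to the
    sign; n odd gives neither. If every supplied value is 0, the theorem
    gives no information.
    """
    for n, value in enumerate(derivs, start=1):
        if value != 0:
            if n % 2 == 1:
                return "neither"
            return "local minimum" if value > 0 else "local maximum"
    return "no conclusion"

# f(x) = x^4 at 0: derivatives 0, 0, 0, 24.
print(classify_by_theorem_2([0, 0, 0, 24]))
# f(x) = -x^6 at 0: derivatives 0, 0, 0, 0, 0, -720.
print(classify_by_theorem_2([0, 0, 0, 0, 0, -720]))
# f(x) = x^3 at 0: derivatives 0, 0, 6.
print(classify_by_theorem_2([0, 0, 6]))
```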


Although Theorem 2 will settle the question of local maxima and minima
for just about any function which arises in practice, it does have some
theoretical limitations, because $f^{(k)}(a)$ may be 0 for all $k$. This happens
(Figure 3(a)) for the function

$$f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0 \\ 0, & x = 0, \end{cases}$$

which has a minimum at 0, and also for the negative of this function (Figure
3(b)), which has a maximum at 0. Moreover (Figure 3(c)), if

$$f(x) = \begin{cases} e^{-1/x^2}, & x > 0 \\ 0, & x = 0 \\ -e^{-1/x^2}, & x < 0, \end{cases}$$

then $f^{(k)}(0) = 0$ for all $k$, but $f$ has neither a local minimum nor a local
maximum at 0.

FIGURE 3. Graphs of the three functions just described, panels (a), (b), and (c).
The conclusion of Theorem 1 is often expressed in terms of an important
concept of “order of equality.” Two functions $f$ and $g$ are equal up to order $n$
at $a$ if

$$\lim_{x \to a} \frac{f(x) - g(x)}{(x - a)^n} = 0.$$

In the language of this definition, Theorem 1 says that the Taylor polynomial
$P_{n,a,f}$ equals $f$ up to order $n$ at $a$. The Taylor polynomial might very well have
been designed to make this fact true, because there is at most one polynomial
of degree $\le n$ with this property. This assertion is a consequence of the
following elementary theorem.


THEOREM 3

Let $P$ and $Q$ be two polynomials in $(x - a)$, of degree $\le n$, and suppose that
$P$ and $Q$ are equal up to order $n$ at $a$. Then $P = Q$.

PROOF


Let $R = P - Q$. Since $R$ is a polynomial of degree $\le n$, it is only necessary to
prove that if

$$R(x) = b_0 + \cdots + b_n(x - a)^n$$

satisfies

$$\lim_{x \to a} \frac{R(x)}{(x - a)^n} = 0,$$

then $R = 0$. Now the hypotheses on $R$ surely imply that

$$\lim_{x \to a} \frac{R(x)}{(x - a)^i} = 0 \quad \text{for } 0 \le i \le n.$$

For $i = 0$ this condition reads simply $\lim_{x \to a} R(x) = 0$; on the other hand,

$$\lim_{x \to a} R(x) = \lim_{x \to a} [b_0 + b_1(x - a) + \cdots + b_n(x - a)^n] = b_0.$$

Thus $b_0 = 0$ and

$$R(x) = b_1(x - a) + \cdots + b_n(x - a)^n.$$

Therefore,

$$\frac{R(x)}{x - a} = b_1 + b_2(x - a) + \cdots + b_n(x - a)^{n-1}$$

and

$$\lim_{x \to a} \frac{R(x)}{x - a} = b_1.$$

Thus $b_1 = 0$ and

$$R(x) = b_2(x - a)^2 + \cdots + b_n(x - a)^n.$$

Continuing in this way, we find that

$$b_0 = \cdots = b_n = 0. \ ∎$$


COROLLARY

Let $f$ be $n$-times differentiable at $a$, and suppose that $P$ is a polynomial in
$(x - a)$ of degree $\le n$, which equals $f$ up to order $n$ at $a$. Then $P = P_{n,a,f}$.

PROOF

Since $P$ and $P_{n,a,f}$ both equal $f$ up to order $n$ at $a$, it is easy to see that $P$ equals
$P_{n,a,f}$ up to order $n$ at $a$. Consequently, $P = P_{n,a,f}$ by the Theorem. ∎


At first sight this corollary appears to have unnecessarily complicated
hypotheses; it might seem that the existence of the polynomial $P$ would
automatically imply that $f$ is sufficiently differentiable for $P_{n,a,f}$ to exist.
But in fact this is not so. For example (Figure 4), suppose that

$$f(x) = \begin{cases} x^{n+1}, & x \text{ irrational} \\ 0, & x \text{ rational.} \end{cases}$$

If $P(x) = 0$, then $P$ is certainly a polynomial of degree $\le n$ which equals $f$ up
to order $n$ at 0. On the other hand, $f'(a)$ does not exist for any $a \ne 0$, so $f''(0)$
is undefined.


FIGURE 4. Graph of $f(x) = x^{n+1}$ for $x$ irrational, $f(x) = 0$ for $x$ rational.


When $f$ does have $n$ derivatives at $a$, however, the corollary may provide a
useful method for finding the Taylor polynomial of $f$. In particular, remember
that our first attempt to find the Taylor polynomial for arctan ended in failure.
The equation

$$\arctan x = \int_0^x \frac{1}{1 + t^2} \, dt$$

suggests a promising method of finding a polynomial close to arctan: divide
1 by $1 + t^2$, to obtain a polynomial plus a remainder:

$$\frac{1}{1 + t^2} = 1 - t^2 + t^4 - t^6 + \cdots + (-1)^n t^{2n} + \frac{(-1)^{n+1} t^{2n+2}}{1 + t^2}.$$

This formula, which can be checked easily by multiplying both sides by $1 + t^2$,
shows that

$$\arctan x = \int_0^x \left[ 1 - t^2 + t^4 - \cdots + (-1)^n t^{2n} \right] dt + (-1)^{n+1} \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt$$
$$= x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + (-1)^n \frac{x^{2n+1}}{2n+1} + (-1)^{n+1} \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt.$$

According to our corollary, the polynomial which appears here will be the
Taylor polynomial of degree $2n + 1$ for arctan at 0, provided that

$$\lim_{x \to 0} \frac{\int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt}{x^{2n+1}} = 0.$$

Since

$$\left| \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt \right| \le \left| \int_0^x t^{2n+2} \, dt \right| = \frac{|x|^{2n+3}}{2n+3},$$

this is clearly true. Thus we have found that the Taylor polynomial of degree
$2n + 1$ for arctan at 0 is

$$P_{2n+1,0}(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + (-1)^n \frac{x^{2n+1}}{2n+1}.$$
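The polynomial just found, together with the bound $|x|^{2n+3}/(2n+3)$ on the integral, can be compared against a library value of arctan. This Python sketch is illustrative only; the function names are hypothetical and `math.atan` is used purely as a reference.

```python
import math

def taylor_arctan(x, n):
    """P_{2n+1,0}(x) = x - x^3/3 + x^5/5 - ... + (-1)^n x^(2n+1)/(2n+1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(n + 1))

def integral_bound(x, n):
    """The bound |x|^(2n+3) / (2n+3) on the remaining integral."""
    return abs(x) ** (2 * n + 3) / (2 * n + 3)

for x in (0.5, -0.3, 1.0):
    for n in (3, 10):
        error = abs(math.atan(x) - taylor_arctan(x, n))
        print(x, n, error, integral_bound(x, n))
```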




By the way, now that we have discovered the Taylor polynomials of arctan, it
is possible to work backwards and find $\arctan^{(k)}(0)$ for all $k$: Since

$$P_{2n+1,0}(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + (-1)^n \frac{x^{2n+1}}{2n+1},$$

and since this polynomial is, by definition,

$$\arctan^{(0)}(0) + \arctan^{(1)}(0) \, x + \frac{\arctan^{(2)}(0)}{2!} x^2 + \cdots + \frac{\arctan^{(2n+1)}(0)}{(2n+1)!} x^{2n+1},$$

we can find $\arctan^{(k)}(0)$ by simply equating the coefficients of $x^k$ in these two
polynomials:

$$\frac{\arctan^{(k)}(0)}{k!} = 0 \quad \text{if } k \text{ is even,}$$

$$\frac{\arctan^{(2l+1)}(0)}{(2l+1)!} = \frac{(-1)^l}{2l+1} \quad \text{or} \quad \arctan^{(2l+1)}(0) = (-1)^l \cdot (2l)!.$$
A much more interesting fact emerges if we go back to the original equation

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + (-1)^n \frac{x^{2n+1}}{2n+1} + (-1)^{n+1} \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt,$$

and remember the estimate

$$\left| \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt \right| \le \frac{|x|^{2n+3}}{2n+3}.$$

When $|x| \le 1$, this expression is at most $1/(2n + 3)$, and we can make this
as small as we like simply by choosing $n$ large enough. In other words, for
$|x| \le 1$ we can use the Taylor polynomials for arctan to compute arctan x as accurately
as we like. The most important theorems about Taylor polynomials extend this
isolated result to other functions, and the Taylor polynomials will soon play
quite a new role. The theorems proved so far have always examined the
behavior of the Taylor polynomial $P_{n,a}$ for fixed $n$, as $x$ approaches $a$.
Henceforth we will compare Taylor polynomials $P_{n,a}$ for fixed $x$, and different $n$. In
anticipation of the coming theorem we introduce some new notation.

If $f$ is a function for which $P_{n,a}(x)$ exists, we define the remainder term
$R_{n,a}(x)$ by

$$f(x) = P_{n,a}(x) + R_{n,a}(x)$$
$$= f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_{n,a}(x).$$

We would like to have an expression for $R_{n,a}(x)$ whose size is easy to estimate.
There is such an expression, involving an integral, just as in the case for arctan.
One way to guess this expression is to begin with the case $n = 0$:

$$f(x) = f(a) + R_{0,a}(x).$$




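The Fundamental Theorem of Calculus enables us to write

$$f(x) = f(a) + \int_a^x f'(t) \, dt,$$

so that

$$R_{0,a}(x) = \int_a^x f'(t) \, dt.$$

A similar expression for $R_{1,a}(x)$ can be derived from this formula using
integration by parts in a rather tricky way: Let

$$u(t) = f'(t) \quad \text{and} \quad v(t) = t - x$$

(notice that $x$ represents some fixed number in the expression for $v(t)$, so
$v'(t) = 1$); then

$$\int_a^x f'(t) \, dt = \int_a^x \underbrace{f'(t)}_{u(t)} \cdot \underbrace{1}_{v'(t)} \, dt = u(t)v(t) \Big|_a^x - \int_a^x \underbrace{f''(t)}_{u'(t)} \underbrace{(t - x)}_{v(t)} \, dt.$$

Since $v(x) = 0$, we obtain

$$f(x) = f(a) + \int_a^x f'(t) \, dt$$
$$= f(a) - u(a)v(a) + \int_a^x f''(t)(x - t) \, dt$$
$$= f(a) + f'(a)(x - a) + \int_a^x f''(t)(x - t) \, dt.$$

Thus

$$R_{1,a}(x) = \int_a^x f''(t)(x - t) \, dt.$$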


It is hard to give any motivation for choosing $v(t) = t - x$, rather than
$v(t) = t$. It just happens to be the choice which works out, the sort of thing one
might discover after sufficiently many similar but futile manipulations.
However, it is now easy to guess the formula for $R_{2,a}(x)$. If

$$u(t) = f''(t) \quad \text{and} \quad v(t) = \frac{-(x - t)^2}{2},$$

then $v'(t) = (x - t)$, so

$$\int_a^x f''(t)(x - t) \, dt = u(t)v(t) \Big|_a^x - \int_a^x f'''(t) \cdot \frac{-(x - t)^2}{2} \, dt$$
$$= \frac{f''(a)(x - a)^2}{2} + \int_a^x \frac{f'''(t)}{2}(x - t)^2 \, dt.$$

This shows that

$$R_{2,a}(x) = \int_a^x \frac{f^{(3)}(t)}{2}(x - t)^2 \, dt.$$


You should now have little difficulty giving a rigorous proof, by induction,
that if $f^{(n+1)}$ is continuous on $[a, x]$, then

$$R_{n,a}(x) = \int_a^x \frac{f^{(n+1)}(t)}{n!}(x - t)^n \, dt.$$

From this formula, which is called the integral form of the remainder, it is
possible (Problem 13) to derive two other important expressions for $R_{n,a}(x)$:
the Cauchy form of the remainder,

$$R_{n,a}(x) = \frac{f^{(n+1)}(t)}{n!}(x - t)^n(x - a) \quad \text{for some } t \text{ in } (a, x),$$

and the Lagrange form of the remainder,

$$R_{n,a}(x) = \frac{f^{(n+1)}(t)}{(n+1)!}(x - a)^{n+1} \quad \text{for some } t \text{ in } (a, x).$$

In the proof of the next theorem (Taylor's Theorem) we will derive all three
forms of the remainder in an entirely different way. One virtue of this proof
(aside from its cleverness) is the fact that the Cauchy and Lagrange forms of
the remainder will be proved without assuming the extra hypothesis that
$f^{(n+1)}$ is continuous. In this way Taylor's Theorem appears as a direct
generalization of the Mean Value Theorem, to which it reduces for $n = 0$, and which
is the crucial tool used in the proof.

These remarks may suggest a strategy for proving Taylor's Theorem.
Since $R_{n,a}(a) = 0$, we might try to apply the Mean Value Theorem to the
expression

$$\frac{R_{n,a}(x)}{x - a} = \frac{R_{n,a}(x) - R_{n,a}(a)}{x - a}.$$

On second thought, however, this idea does not look very promising, since it
is not at all clear how $f^{(n+1)}(t)$ is ever going to be involved in the answer.
Indeed, if we take the most straightforward route, and differentiate both sides
of the equation which defines $R_{n,a}$, we obtain

$$f'(x) = f'(a) + f''(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{(n-1)!}(x - a)^{n-1} + R_{n,a}{}'(x),$$

which is useless. The proper application of the Mean Value Theorem has a
lot in common with the integration by parts proof outlined above. This proof
involved the derivative of a function in which x denoted a number which was fixed.
This is just how $x$ will be treated in the following proof.


THEOREM 4 (TAYLOR'S THEOREM)


Suppose that $f', \ldots, f^{(n+1)}$ are defined on $[a, x]$, and that $R_{n,a}(x)$ is defined
by

$$f(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_{n,a}(x).$$

Then

(1) $R_{n,a}(x) = \dfrac{f^{(n+1)}(t)}{n!}(x - t)^n(x - a)$ for some $t$ in $(a, x)$.

(2) $R_{n,a}(x) = \dfrac{f^{(n+1)}(t)}{(n+1)!}(x - a)^{n+1}$ for some $t$ in $(a, x)$.

Moreover, if $f^{(n+1)}$ is integrable on $[a, x]$, then

(3) $R_{n,a}(x) = \displaystyle\int_a^x \frac{f^{(n+1)}(t)}{n!}(x - t)^n \, dt$.

(If $x < a$, then the hypothesis should state that $f$ is $(n + 1)$-times differentiable
on $[x, a]$; the number $t$ in (1) and (2) will then be in $(x, a)$, while (3) will
remain true as stated, provided that $f^{(n+1)}$ is integrable on $[x, a]$.)

PROOF


For each number $t$ in $[a, x]$ we have

$$f(x) = f(t) + f'(t)(x - t) + \cdots + \frac{f^{(n)}(t)}{n!}(x - t)^n + R_{n,t}(x).$$

Let us denote the number $R_{n,t}(x)$ by $S(t)$; the function $S$ is defined on $[a, x]$,
and we have

$$(*) \qquad f(x) = f(t) + f'(t)(x - t) + \cdots + \frac{f^{(n)}(t)}{n!}(x - t)^n + S(t) \quad \text{for all } t \text{ in } [a, x].$$

We will now differentiate both sides of this equation, which asserts that the
function whose value at $t$ is $f(x)$, equals the function whose value at $t$ is

$$f(t) + f'(t)(x - t) + \cdots + \frac{f^{(n)}(t)}{n!}(x - t)^n + S(t).$$

(In common parlance we are considering both sides of (*) “as a function of $t$.”)
Just to make sure that the letter $x$ causes no confusion, notice that if

$$g(t) = f(x) \quad \text{for all } t,$$

then

$$g'(t) = 0 \quad \text{for all } t;$$

and if

$$g(t) = \frac{f^{(k)}(t)}{k!}(x - t)^k,$$

then

$$g'(t) = \frac{f^{(k)}(t)}{k!} \cdot k(x - t)^{k-1}(-1) + \frac{f^{(k+1)}(t)}{k!}(x - t)^k$$
$$= -\frac{f^{(k)}(t)}{(k-1)!}(x - t)^{k-1} + \frac{f^{(k+1)}(t)}{k!}(x - t)^k.$$




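Applying these formulas to each term of (*), we obtain

$$0 = f'(t) + \left[ -f'(t) + \frac{f''(t)}{1!}(x - t) \right] + \left[ -\frac{f''(t)}{1!}(x - t) + \frac{f^{(3)}(t)}{2!}(x - t)^2 \right]$$
$$+ \cdots + \left[ -\frac{f^{(n)}(t)}{(n-1)!}(x - t)^{n-1} + \frac{f^{(n+1)}(t)}{n!}(x - t)^n \right] + S'(t).$$

In this beautiful formula practically everything in sight cancels out, and we
obtain

$$S'(t) = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n.$$

Now we can apply the Mean Value Theorem to the function $S$ on $[a, x]$: there
is some $t$ in $(a, x)$ such that

$$\frac{S(x) - S(a)}{x - a} = S'(t) = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n.$$

Remember that

$$S(t) = R_{n,t}(x);$$

this means in particular that

$$S(x) = R_{n,x}(x) = 0,$$
$$S(a) = R_{n,a}(x).$$

Thus

$$\frac{0 - R_{n,a}(x)}{x - a} = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n$$

or

$$R_{n,a}(x) = \frac{f^{(n+1)}(t)}{n!}(x - t)^n(x - a);$$

this is the Cauchy form of the remainder.

To derive the Lagrange form we apply the Cauchy Mean Value Theorem
to the functions $S$ and $g(t) = (x - t)^{n+1}$: there is some $t$ in $(a, x)$ such that

$$\frac{S(x) - S(a)}{g(x) - g(a)} = \frac{S'(t)}{g'(t)} = \frac{-\dfrac{f^{(n+1)}(t)}{n!}(x - t)^n}{-(n + 1)(x - t)^n}.$$

Thus

$$\frac{R_{n,a}(x)}{(x - a)^{n+1}} = \frac{f^{(n+1)}(t)}{(n+1)!}$$

or

$$R_{n,a}(x) = \frac{f^{(n+1)}(t)}{(n+1)!}(x - a)^{n+1},$$

which is the Lagrange form.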




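Finally, if $f^{(n+1)}$ is integrable on $[a, x]$, then

$$S(x) - S(a) = \int_a^x S'(t) \, dt = -\int_a^x \frac{f^{(n+1)}(t)}{n!}(x - t)^n \, dt$$

or

$$R_{n,a}(x) = \int_a^x \frac{f^{(n+1)}(t)}{n!}(x - t)^n \, dt. \ ∎$$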
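The integral form of the remainder can be checked numerically in a simple case. The Python sketch below takes $f = \exp$ and $a = 0$, so that $f^{(n+1)}(t) = e^t$, and approximates the integral with a midpoint rule. The function names and the choice of quadrature are assumptions made for illustration only.

```python
import math

def integral_remainder_exp(x, n, steps=10000):
    """Midpoint-rule estimate of the integral of e^t (x - t)^n / n! over [0, x]."""
    h = x / steps
    n_factorial = math.factorial(n)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(t) * (x - t) ** n / n_factorial
    return total * h

def actual_remainder_exp(x, n):
    """e^x minus its Taylor polynomial of degree n at 0."""
    return math.exp(x) - sum(x ** k / math.factorial(k) for k in range(n + 1))

print(integral_remainder_exp(1.0, 3), actual_remainder_exp(1.0, 3))
```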


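Although the Lagrange and Cauchy forms of the remainder are more than
theoretical curiosities (see, e.g., Problem 22-16), the integral form of the
remainder will usually be quite adequate. If this form is applied to the
functions sin, cos, and exp, with $a = 0$, Taylor's Theorem yields the following
formulas:

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \int_0^x \frac{\sin^{(2n+2)}(t)}{(2n+1)!}(x - t)^{2n+1} \, dt,$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \int_0^x \frac{\cos^{(2n+1)}(t)}{(2n)!}(x - t)^{2n} \, dt,$$

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \int_0^x \frac{e^t}{n!}(x - t)^n \, dt.$$

To evaluate any of these integrals explicitly would be supreme foolishness;
the answer of course will be exactly the difference of the left side and all the
other terms on the right side! To estimate these integrals, however, is both easy
and worthwhile.

The first two integrals are especially easy. Since

$$|\sin^{(2n+2)}(t)| \le 1 \quad \text{for all } t,$$

we have

$$\left| \int_0^x \frac{\sin^{(2n+2)}(t)}{(2n+1)!}(x - t)^{2n+1} \, dt \right| \le \frac{1}{(2n+1)!} \left| \int_0^x (x - t)^{2n+1} \, dt \right|.$$

Since

$$\int_0^x (x - t)^{2n+1} \, dt = \left. \frac{-(x - t)^{2n+2}}{2n + 2} \right|_{t=0}^{t=x} = \frac{x^{2n+2}}{2n + 2},$$

we conclude that

$$\left| \int_0^x \frac{\sin^{(2n+2)}(t)}{(2n+1)!}(x - t)^{2n+1} \, dt \right| \le \frac{|x|^{2n+2}}{(2n+2)!}.$$

Similarly, we can show that

$$\left| \int_0^x \frac{\cos^{(2n+1)}(t)}{(2n)!}(x - t)^{2n} \, dt \right| \le \frac{|x|^{2n+1}}{(2n+1)!}.$$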




These estimates are particularly interesting, because (as proved in Chapter 16)
for any $\varepsilon > 0$ we can make

$$\frac{x^n}{n!} < \varepsilon$$

by choosing $n$ large enough (how large $n$ must be will depend on $x$). This
enables us to compute $\sin x$ to any degree of accuracy desired simply by
evaluating the proper Taylor polynomial $P_{n,0}(x)$. For example, suppose we
wish to compute $\sin 2$ with an error of less than $10^{-4}$. Since

$$\sin 2 = P_{2n+1,0}(2) + R, \quad \text{where } |R| \le \frac{2^{2n+2}}{(2n+2)!},$$

we can use $P_{2n+1,0}(2)$ as our answer, provided that

$$\frac{2^{2n+2}}{(2n+2)!} < 10^{-4}.$$

A number $n$ with this property can be found by a straightforward search;
it obviously helps to have a table of values for $n!$ and $2^n$ (see page 356). In this
case it happens that $n = 5$ works, so that

$$\sin 2 = P_{11,0}(2) + R = 2 - \frac{2^3}{3!} + \frac{2^5}{5!} - \frac{2^7}{7!} + \frac{2^9}{9!} - \frac{2^{11}}{11!} + R,$$

where $|R| < 10^{-4}$.

It is even easier to calculate $\sin 1$ approximately, since

$$\sin 1 = P_{2n+1,0}(1) + R, \quad \text{where } |R| \le \frac{1}{(2n+2)!}.$$

To obtain an error less than $\varepsilon$ we need only find an $n$ such that

$$\frac{1}{(2n+2)!} < \varepsilon,$$

and this requires only a brief glance at a table of factorials. (Moreover, the
individual terms of $P_{2n+1,0}(1)$ will be easier to handle.)

For very small $x$ the estimates will be even easier. For example,

$$\sin \frac{1}{10} = P_{2n+1,0}\left(\frac{1}{10}\right) + R, \quad \text{where } |R| \le \frac{1}{10^{2n+2}(2n+2)!}.$$

To obtain $|R| < 10^{-10}$ we can clearly take $n = 4$ (and we could even get away
with $n = 3$). These methods are actually used to compute tables of sin and
cos. A high-speed computer can compute $P_{2n+1,0}(x)$ for many different $x$ in
almost no time at all.
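The "straightforward search" for a suitable $n$ is easy to mechanize. The Python sketch below is an illustration with hypothetical function names; it looks for the smallest $n$ with $|x|^{2n+2}/(2n+2)! < $ tol and then evaluates the corresponding Taylor polynomial.

```python
import math

def smallest_n(x, tol):
    """Smallest n with |x|^(2n+2) / (2n+2)! < tol."""
    n = 0
    while abs(x) ** (2 * n + 2) / math.factorial(2 * n + 2) >= tol:
        n += 1
    return n

def sin_taylor(x, n):
    """P_{2n+1,0}(x) for sin."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

n = smallest_n(2, 1e-4)
print(n, sin_taylor(2, n), math.sin(2))
```

For $x = 2$ and tolerance $10^{-4}$ the search stops at $n = 5$, in agreement with the text; for $x = 1/10$ and tolerance $10^{-10}$ it stops at $n = 3$.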

Estimating the remainder for $e^x$ is only slightly harder. For simplicity
assume that $x > 0$ (the estimates for $x < 0$ are obtained in Problem 9). On
the interval $[0, x]$ the maximum value of $e^t$ is $e^x$, since exp is increasing, so

$$\int_0^x \frac{e^t}{n!}(x - t)^n \, dt \le \frac{e^x}{n!} \int_0^x (x - t)^n \, dt = \frac{e^x x^{n+1}}{(n+1)!}.$$

Since we already know that $e < 4$, we have

$$\frac{e^x x^{n+1}}{(n+1)!} < \frac{4^x x^{n+1}}{(n+1)!},$$

which can be made as small as desired by choosing $n$ sufficiently large. How
large $n$ must be will depend on $x$ (and the factor $4^x$ will make things more
difficult). Once again, the estimates are easier for small $x$. If $0 < x \le 1$, then

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + R, \quad \text{where } 0 < R < \frac{4}{(n+1)!}.$$

(The inequality $0 < R$ follows immediately from the integral form for $R$.)
In particular, if $n = 4$, then

$$0 < R < \frac{4}{5!} < \frac{1}{10},$$

so

$$e = e^1 = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + R = \frac{65}{24} + R = 2 + \frac{17}{24} + R, \quad \text{where } 0 < R < \frac{1}{10},$$

which shows that

$$2 < e < 3.$$

(This allows us to improve our estimate of $R$ slightly:

$$0 < R < \frac{3^x x^{n+1}}{(n+1)!}.)$$

By taking $n = 7$ you can compute that the first 3 decimals for $e$ are

$$e = 2.718 \ldots$$

(you should check that $n = 7$ does give this degree of accuracy, but it would
be cruel to insist that you actually do the computations).
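The computation that it "would be cruel to insist" on is painless for a machine. The Python sketch below uses exact rational arithmetic and the improved bound $0 < R < 3/(n+1)!$ for $x = 1$; the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def partial_sum(n):
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

n = 7
low = partial_sum(n)                         # e is larger than this
high = low + Fraction(3, factorial(n + 1))   # and smaller than this
print(float(low), float(high))
```

Both endpoints lie strictly between 2.718 and 2.719, so $n = 7$ does determine the first three decimals of $e$.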

The function arctan is also important but, as you may recall, an expression
for $\arctan^{(k)}(x)$ is hopelessly complicated, so that the integral form of the
remainder is useless. On the other hand, our derivation of the Taylor
polynomial for arctan automatically provided a formula for the remainder:

$$\arctan x = x - \frac{x^3}{3} + \cdots + (-1)^n \frac{x^{2n+1}}{2n+1} + (-1)^{n+1} \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt.$$

As we have already estimated,

$$\left| \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt \right| \le \left| \int_0^x t^{2n+2} \, dt \right| = \frac{|x|^{2n+3}}{2n+3}.$$

For the moment we will consider only numbers $x$ with $|x| \le 1$. In this case,
the remainder term can clearly be made as small as desired by choosing $n$
sufficiently large. In particular,

$$\arctan 1 = 1 - \frac{1}{3} + \frac{1}{5} - \cdots + \frac{(-1)^n}{2n+1} + R, \quad \text{where } |R| < \frac{1}{2n+3}.$$

With this estimate it is easy to find an $n$ which will make the remainder less
than any preassigned number; on the other hand, $n$ will usually have to be so
large as to make computations hopelessly long. To obtain a remainder
$< 10^{-4}$, for example, we must take $n > (10^4 - 3)/2$. This is really a shame,
because $\arctan 1 = \pi/4$, so the Taylor polynomial for arctan should allow us
to compute $\pi$. Fortunately, there are some clever tricks which enable us to
surmount these difficulties. Since

$$|R_{2n+1,0}(x)| \le \frac{|x|^{2n+3}}{2n+3},$$

much smaller $n$'s will work for only somewhat smaller $x$'s. The trick for
computing $\pi$ is to express $\arctan 1$ in terms of $\arctan x$ for smaller $x$; Problem 5
shows how this can be done in a convenient way.

The Taylor polynomial for the function $f(x) = \log(x + 1)$ at $a = 0$ is best
handled in the same manner as the Taylor polynomial for arctan. Although
the integral form of the remainder for $f$ is not hard to write down, it is difficult
to estimate. On the other hand, we obtain a simple formula if we begin with
the equation

$$\frac{1}{1 + t} = 1 - t + t^2 - \cdots + (-1)^{n-1} t^{n-1} + \frac{(-1)^n t^n}{1 + t};$$

this implies that

$$\log(1 + x) = \int_0^x \frac{1}{1 + t} \, dt = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1} \frac{x^n}{n} + (-1)^n \int_0^x \frac{t^n}{1 + t} \, dt,$$

for all $x > -1$. If $x \ge 0$, then

$$\left| \int_0^x \frac{t^n}{1 + t} \, dt \right| \le \int_0^x t^n \, dt = \frac{x^{n+1}}{n + 1},$$

and there is a slightly more complicated estimate when $-1 < x < 0$ (Problem
10). For this function the remainder term can be made as small as desired by
choosing $n$ sufficiently large, provided that $-1 < x \le 1$.




The behavior of the remainder terms for arctan and $f(x) = \log(x + 1)$ is
quite another matter when $|x| > 1$. In this case, the estimates

$$|R_{2n+1,0}(x)| \le \frac{|x|^{2n+3}}{2n+3} \quad \text{for arctan,}$$
$$|R_{n,0}(x)| \le \frac{x^{n+1}}{n+1} \quad (x > 0) \quad \text{for } f,$$

are of no use, because when $|x| > 1$ the bounds $x^m/m$ become large as $m$
becomes large. This predicament is unavoidable, and is not just a deficiency
of our estimates. It is easy to get estimates in the other direction which show
that the remainders actually do remain large. To obtain such an estimate for
arctan, note that if $t$ is in $[0, x]$ (or in $[x, 0]$ if $x < 0$), then

$$1 + t^2 \le 1 + x^2 \le 2x^2, \quad \text{if } |x| \ge 1,$$

so

$$\left| \int_0^x \frac{t^{2n+2}}{1 + t^2} \, dt \right| \ge \frac{1}{2x^2} \left| \int_0^x t^{2n+2} \, dt \right| = \frac{|x|^{2n+1}}{4n + 6}.$$

Similarly, if $x > 0$, then for $t$ in $[0, x]$ we have

$$1 + t \le 1 + x \le 2x, \quad \text{if } x \ge 1,$$

so

$$\int_0^x \frac{t^n}{1 + t} \, dt \ge \frac{1}{2x} \int_0^x t^n \, dt = \frac{x^n}{2n + 2}.$$

These estimates show that if $|x| > 1$, then the remainder terms become large
as $n$ becomes large. In other words, for $|x| > 1$, the Taylor polynomials for
arctan and f are of no use whatsoever in computing arctan x and log(x + 1). This is
no tragedy, because the values of these functions can be found for any $x$ once
they are known for all $x$ with $|x| < 1$.
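The growth of the remainder for $|x| > 1$ is easy to see numerically. The Python sketch below takes $f(x) = \log(1 + x)$ at $x = 2$ and compares the actual remainder with the two-sided estimate $x^n/(2n+2) \le |R_{n,0}(x)| \le x^{n+1}/(n+1)$ derived above; the function names are hypothetical and `math.log` supplies the reference value.

```python
import math

def taylor_log1p(x, n):
    """P_{n,0}(x) for log(1 + x): x - x^2/2 + ... + (-1)^(n-1) x^n / n."""
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, n + 1))

def remainder(x, n):
    """|log(1 + x) - P_{n,0}(x)|."""
    return abs(math.log(1 + x) - taylor_log1p(x, n))

x = 2.0
for n in (5, 10, 20):
    print(n, x ** n / (2 * n + 2), remainder(x, n), x ** (n + 1) / (n + 1))
```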

This same situation occurs in a spectacular way for the function

$$f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0 \\ 0, & x = 0. \end{cases}$$

We have already seen that $f^{(k)}(0) = 0$ for every natural number $k$. This means
that the Taylor polynomial $P_{n,0}$ for $f$ is

$$P_{n,0}(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \cdots + \frac{f^{(n)}(0)}{n!}x^n = 0.$$

In other words, the remainder term $R_{n,0}(x)$ always equals $f(x)$, and the Taylor
polynomial is useless for computing $f(x)$, except for $x = 0$. Eventually we will
be able to offer some explanation for the behavior of this function, which is
such a disconcerting illustration of the limitations of Taylor's Theorem.

The word "compute" has been used so often in connection with our estimates
for the remainder term, that the significance of Taylor's Theorem might
be misconstrued. It is true that Taylor's Theorem is an almost ideal computational
aid (despite its ignominious failure in the previous example), but it has
equally important theoretical consequences. Most of these will be developed
in succeeding chapters, but two proofs will illustrate some ways in which
Taylor's Theorem may be used. The first illustration will be particularly
impressive to those who have waded through the proof, in Chapter 16, that
π is irrational.

THEOREM 5

e is irrational.

PROOF

We know that, for any n,

\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + R_n,
   \qquad 0 < R_n < \frac{3}{(n+1)!}. \]

Suppose that e were rational, say e = a/b, where a and b are positive integers.
Choose n > b and also n > 3. Then

\[ \frac{n!\,a}{b} = n! + \frac{n!}{1!} + \frac{n!}{2!} + \cdots + \frac{n!}{n!} + n!\,R_n. \]

Every term in this equation other than n!R_n is an integer (the left side is an
integer because n > b). Consequently, n!R_n must be an integer also. But

\[ 0 < R_n < \frac{3}{(n+1)!}, \]

so

\[ 0 < n!\,R_n < \frac{3}{n+1} < \frac{3}{4} < 1, \]

which is impossible for an integer. ∎
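
The two facts used in the proof can be observed with exact rational arithmetic. The sketch below is an illustration, not a proof: the names `partial_sum` and `tail_upper_bound` are assumptions of this example, and a longer partial sum is used as a stand-in for e when estimating R_n.

```python
from fractions import Fraction
from math import factorial


def partial_sum(n: int) -> Fraction:
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum((Fraction(1, factorial(k)) for k in range(n + 1)), Fraction(0))


def tail_upper_bound(n: int) -> Fraction:
    """The bound 3/(n + 1)! on the remainder R_n quoted in the proof."""
    return Fraction(3, factorial(n + 1))


if __name__ == "__main__":
    for n in range(4, 9):
        # n! times the partial sum is an integer, as the proof requires.
        scaled = factorial(n) * partial_sum(n)
        # A longer partial sum approximates e from below, so this
        # difference approximates R_n from below.
        approx_tail = partial_sum(n + 10) - partial_sum(n)
        print(n, scaled.denominator == 1, approx_tail < tail_upper_bound(n))
```

Multiplying the bound by n! gives 3/(n + 1), which is below 1 once n > 3; that is the step that rules out n!R_n being an integer.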


The second illustration is merely a straightforward demonstration of a fact
proved in Chapter 15: If

\[ f'' + f = 0, \qquad f(0) = 0, \qquad f'(0) = 0, \]

then f = 0. To prove this, observe first that f^{(k)} exists for every k; in fact

\[ f^{(3)} = (f'')' = -f', \]
\[ f^{(4)} = (f^{(3)})' = (-f')' = -f'' = f, \]
\[ f^{(5)} = (f^{(4)})' = f', \]

etc. This shows, not only that all f^{(k)} exist, but also that there are at most 4 different
ones: f, f', -f, -f'. Since f(0) = f'(0) = 0, all f^{(k)}(0) are 0. Now Taylor's
Theorem states, for any n, that

\[ f(x) = \int_0^x \frac{f^{(n+1)}(t)}{n!}\,(x - t)^n\,dt. \]

Each function f^{(n+1)} is continuous (since f^{(n+2)} exists), so for any particular x
there is a number M such that

\[ |f^{(n+1)}(t)| \le M \quad \text{for } 0 \le t \le x, \text{ and all } n \]

(we can add the phrase "and all n" because there are only four different f^{(k)}).
Thus

\[ |f(x)| \le M \left| \int_0^x \frac{(x-t)^n}{n!}\,dt \right| = \frac{M|x|^{n+1}}{(n+1)!}. \]

Since this is true for every n, and since x^n/n! can be made as small as desired
by choosing n sufficiently large, this shows that |f(x)| < ε for any ε > 0; consequently,
f(x) = 0.

The other uses to which Taylor's Theorem will be put in succeeding chapters
are closely related to the computational considerations which have concerned
us for much of this chapter. If the remainder term R_{n,a}(x) can be made
as small as desired by choosing n sufficiently large, then f(x) can be computed
to any degree of accuracy desired by using the polynomials P_{n,a}(x). As we
require greater and greater accuracy we must add on more and more terms.
If we are willing to add up infinitely many terms (in theory at least!), then
we ought to be able to ignore the remainder completely. There should be
"infinite sums" like

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots, \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \]
\[ e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \]
\[ \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots, \quad \text{if } |x| \le 1, \]
\[ \log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad \text{if } -1 < x \le 1. \]

We are almost completely prepared for this step. Only one obstacle remains:
we have never even defined an infinite sum. Chapters 21 and 22 contain the
necessary definitions.




PROBLEMS 


1. Find the Taylor polynomials (of the indicated degree, and at the indicated
   point) for the following functions.
   (i) f(x) = e^{e^x}; degree 3, at 0.
   (ii) f(x) = e^{sin x}; degree 3, at 0.
   (iii) sin; degree 2n, at π/2.
   (iv) cos; degree 2n, at π.
   (v) exp; degree n, at 1.
   (vi) log; degree n, at 2.
   (vii) f(x) = x^5 + x^3 + x; degree 4, at 0.
   (viii) f(x) = x^5 + x^3 + x; degree 4, at 1.
   (ix) f(x) = 1/(1 + x^2); degree 2n + 1, at 0.
   (x) f(x) = 1/(1 + x); degree n, at 0.


2. Write each of the following polynomials in x as a polynomial in (x - 3).
   (It is only necessary to compute the Taylor polynomial at 3, of the same
   degree as the original polynomial. Why?)
   (i) x^2 - 4x - 9.
   (ii) x^4 - 12x^3 + 44x^2 + 2x + 1.
   (iii) x^5.
   (iv) ax^2 + bx + c.


3. Write down a sum (using Σ notation) which equals each of the following
   numbers to within the specified accuracy. To minimize needless computation,
   consult the tables for 2^n and n! given below.


(i) sin 1; error < 107”. 
(ii) sin 2; error < 1013 
(iii) sin 4; error < 107”. 
(iv) e; error < 107‘. 


(v) 6452; error < 107°. 


    n          2^n                         n!
    1            2                          1
    2            4                          2
    3            8                          6
    4           16                         24
    5           32                        120
    6           64                        720
    7          128                      5,040
    8          256                     40,320
    9          512                    362,880
   10        1,024                  3,628,800
   11        2,048                 39,916,800
   12        4,096                479,001,600
   13        8,192              6,227,020,800
   14       16,384             87,178,291,200
   15       32,768          1,307,674,368,000
   16       65,536         20,922,789,888,000
   17      131,072        355,687,428,096,000
   18      262,144      6,402,373,705,728,000
   19      524,288    121,645,100,408,832,000
   20    1,048,576  2,432,902,008,176,640,000

*4. This problem is similar to the previous one, except that the errors
    demanded are so small that the tables cannot be used. You will have to
    do a little thinking, and in some cases it may be necessary to consult the
    proof, in Chapter 16, that x^n/n! can be made small by choosing n large;
    the proof actually provides a method for finding the appropriate n. In
    the previous problem it was possible to find rather short sums; in fact, it
    was possible to find the smallest n which makes the estimate of the
    remainder given by Taylor's Theorem less than the desired error. But in
    this problem finding any specific sum is a moral victory (provided you
    can demonstrate that the sum works).


    (i) sin 1; error < 10^{-(10^{10})}.
    (ii) e; error < 10^{-1000}.
    (iii) sin 10; error < 10^{-20}.
    (iv) e^{10}; error < 10^{-30}.
    (v) arctan 1/10; error < 10^{-(10^{10})}.


5. (a) Prove, using Problem 15-8, that

\[ \frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{3}, \]
\[ \frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}. \]

   (b) Show that π = 3.14159 . . . . (Every young man should verify a
       few decimals of π for himself, but the purpose of this exercise is not
       to set you off on an immense calculation. If the second expression in
       part (a) is used, the first 5 decimals for π can be computed with
       remarkably little work.)
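
As a rough check on how little work part (b) needs, the second formula of part (a) can be evaluated with exact fractions. This is only a sketch: `arctan_partial` and `machin_pi` are names invented here, and the code simply adds the first few terms of x - x^3/3 + x^5/5 - ... without proving any error estimate.

```python
from fractions import Fraction


def arctan_partial(x: Fraction, terms: int) -> Fraction:
    """Sum of the first `terms` terms of x - x^3/3 + x^5/5 - ..."""
    return sum(
        ((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms)),
        Fraction(0),
    )


def machin_pi(terms: int) -> Fraction:
    """Approximate pi from pi/4 = 4 arctan(1/5) - arctan(1/239)."""
    quarter = 4 * arctan_partial(Fraction(1, 5), terms) - arctan_partial(
        Fraction(1, 239), terms
    )
    return 4 * quarter


if __name__ == "__main__":
    for terms in (1, 2, 3, 4):
        print(terms, float(machin_pi(terms)))
```

Because 1/5 and 1/239 are small, each extra term gains more than a decimal digit; the remainder estimates of this chapter are what justify trusting the digits.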

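6. For every number α, and every nonnegative integer n, we define the
   "binomial coefficient"

\[ \binom{\alpha}{n} = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}. \]

   If α is not an integer, then \binom{\alpha}{n} is never 0, and alternates in sign for
   n > α. Show that the Taylor polynomial of degree n for f(x) = (1 + x)^α
   at 0 is P_{n,0}(x) = \sum_{k=0}^{n} \binom{\alpha}{k} x^k, and that the Cauchy and Lagrange forms
   of the remainder are the following:

   Cauchy form:

\[ R_{n,0}(x) = \frac{\alpha(\alpha-1)\cdots(\alpha-n)}{n!}\, x (x-t)^n (1+t)^{\alpha-n-1} \]
\[ \qquad = \frac{\alpha(\alpha-1)\cdots(\alpha-n)}{n!}\, x (1+t)^{\alpha-1} \left(\frac{x-t}{1+t}\right)^n \]
\[ \qquad = (n+1)\binom{\alpha}{n+1}\, x (1+t)^{\alpha-1} \left(\frac{x-t}{1+t}\right)^n, \quad t \text{ in } [0, x] \text{ or } [x, 0]. \]

   Lagrange form:

\[ R_{n,0}(x) = \frac{\alpha(\alpha-1)\cdots(\alpha-n)}{(n+1)!}\, x^{n+1} (1+t)^{\alpha-n-1} \]
\[ \qquad = \binom{\alpha}{n+1}\, x^{n+1} (1+t)^{\alpha-n-1}, \quad t \text{ in } [0, x] \text{ or } [x, 0]. \]

   Estimates for these remainder terms are rather difficult to handle, and
   are postponed to Problem 22-16.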

7. Suppose that a_i and b_i are the coefficients in the Taylor polynomials at
   a of f and g, respectively. In other words, a_i = f^{(i)}(a)/i! and b_i =
   g^{(i)}(a)/i!. Find the coefficients c_i of the Taylor polynomials at a of the
   following functions, in terms of the a_i's and b_i's.
   (i) f + g.
   (ii) fg.
   (iii) f'.
   (iv) h(x) = \int_a^x f(t)\,dt.
   (v) k(x) = \int_0^x f(t)\,dt.

*8. (a) Prove that the Taylor polynomial of f(x) = sin(x^2) of degree
        4n + 2 at 0 is

\[ x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \cdots + (-1)^n \frac{x^{4n+2}}{(2n+1)!}. \]

        Hint: If P is the Taylor polynomial of degree 2n + 1 for sin at 0,
        then sin x = P(x) + R(x), where \lim_{x \to 0} R(x)/x^{2n+1} = 0. What does
        this imply about \lim_{x \to 0} R(x^2)/x^{4n+2}?
    (b) Find f^{(k)}(0) for all k.
    (c) In general, if f(x) = g(x^n), find f^{(k)}(0) in terms of the derivatives of
        g at 0.

9. Prove that if x < 0, then


re α τ λα] < 


10. Prove that if -1 < x ≤ 0, then

\[ \left| \int_0^x \frac{t^n}{1+t}\,dt \right| \le \frac{|x|^{n+1}}{(1+x)(n+1)}. \]


*11. (a) Show that if |g'(x)| ≤ M|x - a|^n for |x - a| < δ, then |g(x)| ≤
         M|x - a|^{n+1}/(n + 1) for |x - a| < δ.
     (b) Use part (a) to show that if \lim_{x \to a} g'(x)/(x - a)^n = 0, then
         \lim_{x \to a} g(x)/(x - a)^{n+1} = 0.
     (c) Show that if g(x) = f(x) - P_{n,a,f}(x), then g'(x) = f'(x) - P_{n-1,a,f'}(x).
     (d) Give an inductive proof of Theorem 1, without using l'Hôpital's
         Rule.

12. Deduce Theorem 1 as a corollary of Taylor's Theorem, with any form
    of the remainder. (The catch is that it will be necessary to assume one
    more derivative than in the hypotheses for Theorem 1.)

13. Deduce the Cauchy and Lagrange forms of the remainder from the
    integral form, using Problems 13-27 and 13-28. There will be the same
    catch as in Problem 12.

14. (a) Prove that if f''(a) exists, then

\[ f''(a) = \lim_{h \to 0} \frac{f(a+h) + f(a-h) - 2f(a)}{h^2}. \]

        Hint: Use the Taylor polynomial of degree 2 with x = a + h and
        with x = a - h.
    (b) Let f(x) = x^2 for x ≥ 0, and -x^2 for x ≤ 0. Show that

\[ \lim_{h \to 0} \frac{f(0+h) + f(0-h) - 2f(0)}{h^2} \]

        exists, even though f''(0) does not.

15. Use the Taylor polynomial P_{1,a,f}, together with the remainder, to prove
    a weak form of Theorem 2 of the Appendix to Chapter 11: If f'' > 0,
    then the graph of f always lies above the tangent line of f, except at the
    point of contact.

*16. Problem 17-32 presented a rather complicated proof that f = 0 if
     f'' - f = 0 and f(0) = f'(0) = 0. Give another proof, using Taylor's
     Theorem. (This problem is really a preliminary skirmish before doing
     battle with the general case in Problem 17, and is meant to convince you
     that Taylor's Theorem is a good tool for tackling such problems, even
     though tricks work out more neatly for special cases.)
17. Consider a function f which satisfies the differential equation

\[ f^{(n)} = \sum_{j=0}^{n-1} a_j f^{(j)}, \]

    for certain numbers a_0, . . . , a_{n-1}. Several special cases have already
    received detailed treatment, either in the text or in other problems; in
    particular, we have found all functions satisfying f' = f, or f'' + f = 0,
    or f'' - f = 0. The trick in Problem 17-31 enables us to find many solutions
    for such equations, but doesn't say whether these are the only
    solutions. This requires a uniqueness result, which will be supplied by this
    problem. At the end you will find some (necessarily sketchy) remarks
    about the general solution.

    (a) Derive the following formula for f^{(n+1)} (let us agree that "a_{-1}" will
        be 0):

\[ f^{(n+1)} = \sum_{j=0}^{n-1} (a_{j-1} + a_{n-1}a_j) f^{(j)}. \]

    (b) Deduce a formula for f^{(n+2)}.

    The formula in part (b) is not going to be used; it was inserted only to
    convince you that a general formula for f^{(n+k)} is out of the question. On
    the other hand, as part (c) shows, it is not very hard to obtain estimates
    on the size of f^{(n+k)}(x).

    (c) Let N = max(1, |a_0|, . . . , |a_{n-1}|). Then |a_{j-1} + a_{n-1}a_j| ≤ 2N^2;
        this means that

\[ f^{(n+1)} = \sum_{j=0}^{n-1} b_j^1 f^{(j)}, \quad \text{where } |b_j^1| \le 2N^2. \]

        Show that

\[ f^{(n+2)} = \sum_{j=0}^{n-1} b_j^2 f^{(j)}, \quad \text{where } |b_j^2| \le 4N^3, \]

        and, more generally,

\[ f^{(n+k)} = \sum_{j=0}^{n-1} b_j^k f^{(j)}, \quad \text{where } |b_j^k| \le 2^k N^{k+1}. \]

    (d) Conclude from part (c) that, for any particular number x, there is
        a number M such that

\[ |f^{(n+k)}(x)| \le M \cdot 2^k N^{k+1} \quad \text{for all } k. \]

    (e) Now suppose that f(0) = f'(0) = · · · = f^{(n-1)}(0) = 0. Show that

\[ |f(x)| \le \frac{M \cdot 2^{k+1} N^{k+2}\, |x|^{n+k+1}}{(n+k+1)!}
   \le \frac{M \cdot |2Nx|^{n+k+1}}{(n+k+1)!}, \]

        and conclude that f = 0.
    (f) Show that if f_1 and f_2 are both solutions of the differential equation

\[ f^{(n)} = \sum_{j=0}^{n-1} a_j f^{(j)}, \]

        and f_1^{(j)}(0) = f_2^{(j)}(0) for 0 ≤ j ≤ n - 1, then f_1 = f_2.

        In other words, the solutions of this differential equation are determined
        by the "initial conditions" (the values f^{(j)}(0) for 0 ≤ j ≤
        n - 1). This means that we can find all solutions once we can find
        enough solutions to obtain any given set of initial conditions. If the
        equation

\[ x^n - a_{n-1}x^{n-1} - \cdots - a_0 = 0 \]

        has n distinct roots α_1, . . . , α_n, then any function of the form

\[ f(x) = c_1 e^{\alpha_1 x} + \cdots + c_n e^{\alpha_n x} \]

        is a solution, and

\[ f(0) = c_1 + \cdots + c_n, \]
\[ f'(0) = \alpha_1 c_1 + \cdots + \alpha_n c_n, \]
\[ \vdots \]
\[ f^{(n-1)}(0) = \alpha_1^{n-1} c_1 + \cdots + \alpha_n^{n-1} c_n. \]

        As a matter of fact, every solution is of this form, because we can
        obtain any set of numbers on the left side by choosing the c's
        properly, but we will not try to prove this last assertion. (It is a
        purely algebraic fact, which you can easily check for n = 2 or 3.)
        These remarks are also true if some of the roots are multiple roots,
        and even in the more general situation considered in Chapter 26.
*18. (a) Let f(x) = x^4 sin 1/x^2 for x ≠ 0, and f(0) = 0. Show that f = 0 up
         to order 2 at 0, even though f''(0) does not exist.

         This example is slightly more complex, but also slightly more
         impressive, than the example in the text, because both f'(a) and
         f''(a) exist for a ≠ 0. Thus, for each number a there is another number
         m(a) such that

\[ (*) \quad f(x) = f(a) + f'(a)(x - a) + \frac{m(a)}{2}(x - a)^2 + R_a(x),
   \quad \text{where } \lim_{x \to a} \frac{R_a(x)}{(x-a)^2} = 0; \]

         namely, m(a) = f''(a) for a ≠ 0, and m(0) = 0. Notice that the
         function m defined in this way is not continuous.

     (b) Let f be a differentiable function. Suppose that there is a function m
         such that (*) holds for all numbers a, and that m is continuous.
         Prove that f''(a) exists for all a and equals m(a). Hint: Write (*) for
         x = a + h, a = a, and also for x = a, a = a + h.

     **(c) Suppose that there is a continuous function m such that

\[ f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{m(a)}{3!}(x - a)^3 + R_a(x),
   \quad \text{where } \lim_{x \to a} \frac{R_a(x)}{(x-a)^3} = 0. \]

         Does it follow that f'''(a) = m(a)?


*CHAPTER 


e IS TRANSCENDENTAL


The irrationality of e was so easy to prove that in this optional chapter we will
attempt a more difficult feat, and prove that the number e is not merely
irrational, but actually much worse. Just how a number might be even worse
than irrational is suggested by a slight rewording of definitions. A number x
is irrational if it is not possible to write x = a/b for any integers a and b, with
b ≠ 0. This is the same as saying that x does not satisfy any equation

\[ bx - a = 0 \]

for integers a and b, except for a = 0, b = 0. Viewed in this light, the irrationality
of √2 does not seem to be such a terrible deficiency; rather, it
appears that √2 just barely manages to be irrational: although √2 is not the
solution of an equation

\[ a_1 x + a_0 = 0, \]

it is the solution of the equation

\[ x^2 - 2 = 0, \]

of one higher degree. Problem 2-17 shows how to produce many irrational
numbers x which satisfy higher-degree equations

\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 = 0, \]

where the a_i are integers and a_0 ≠ 0 (this condition rules out the possibility
that all a_i = 0). A number which satisfies an "algebraic" equation of this sort
is called an algebraic number, and practically every number we have ever
encountered is defined in terms of solutions of algebraic equations (π and e
are the great exceptions in our limited mathematical experience). All roots,


such as 7 
M9. 973 7 


are clearly algebraic numbers, and even complicated combinations, like 
3 TO DD OOOF=EY 
V34V54Vi4V24% 


are algebraic (although we will not try to prove this). Numbers which cannot
be obtained by the process of solving algebraic equations are called transcendental;
the main result of this chapter states that e is a number of this
anomalous sort.

The proof that e is transcendental is well within our grasp, and was theoretically
possible even before Chapter 19. Nevertheless, with the inclusion of
this proof, we can justifiably classify ourselves as something more than novices
in the study of higher mathematics; while many irrationality proofs depend
only on elementary properties of numbers, the proof that a number is transcendental
usually involves some really high-powered mathematics. Even the
dates connected with the transcendence of e are impressively recent: the first
proof that e is transcendental, due to Hermite, dates from 1873. The proof
that we will give is a simplification, due to Hilbert.

Before tackling the proof itself, it is a good idea to map out the strategy,
which depends on an idea used even in the proof that e is irrational. Two
features of the expression

\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + R_n \]

were important for the proof that e is irrational: On the one hand, the number

\[ 1 + \frac{1}{1!} + \cdots + \frac{1}{n!} \]

can be written as a fraction p/q with q ≤ n! (so that n!(p/q) is an integer);
on the other hand, 0 < R_n < 3/(n + 1)! (so n!R_n is not an integer). These
two facts show that e can be approximated particularly well by rational
numbers. Of course, every number x can be approximated arbitrarily closely
by rational numbers: if ε > 0 there is a rational number r with |x - r| < ε;
the catch, however, is that it may be necessary to allow a very large denominator
for r, as large as 1/ε perhaps. For e we are assured that this is not the
case: there is a fraction p/q within 3/(n + 1)! of e, whose denominator q is at
most n!. If you look carefully at the proof that e is irrational, you will see that
only this fact about e is ever used. The number e is by no means unique in this
respect: generally speaking, the better a number can be approximated by
rational numbers, the worse it is (some evidence for this assertion is presented
in Problem 3). The proof that e is transcendental depends on a natural
extension of this idea: not only e, but any finite number of powers e, e^2, . . . ,
e^n, can be simultaneously approximated especially well by rational numbers.
In our proof we will begin by assuming that e is algebraic, so that

\[ (*) \quad a_n e^n + \cdots + a_1 e + a_0 = 0, \qquad a_0 \ne 0 \]

for some integers a_0, . . . , a_n. In order to reach a contradiction we will then
find certain integers M, M_1, . . . , M_n and certain "small" numbers
ε_1, . . . , ε_n such that

\[ e^1 = \frac{M_1 + \epsilon_1}{M}, \quad e^2 = \frac{M_2 + \epsilon_2}{M}, \quad \ldots, \quad e^n = \frac{M_n + \epsilon_n}{M}. \]

Just how small the ε's must be will appear when these expressions are substituted
into the assumed equation (*). After multiplying through by M we
obtain

\[ [a_0 M + a_1 M_1 + \cdots + a_n M_n] + [\epsilon_1 a_1 + \cdots + \epsilon_n a_n] = 0. \]

The first term in brackets is an integer, and we will choose the M's so that it
will necessarily be a nonzero integer. We will also manage to find ε's so small
that

\[ |\epsilon_1 a_1 + \cdots + \epsilon_n a_n| < \tfrac{1}{2}; \]

this will lead to the desired contradiction: the sum of a nonzero integer and a
number of absolute value less than 1/2 cannot be zero!

As a basic strategy this is all very reasonable and quite straightforward.
The remarkable part of the proof will be the way that the M's and ε's are
defined. In order to read the proof you will need to know about the gamma
function! (This function was introduced in Problem 18-25.)

THEOREM 1

e is transcendental.

PROOF


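Suppose there were integers a_0, . . . , a_n, with a_0 ≠ 0, such that

\[ (*) \quad a_n e^n + a_{n-1} e^{n-1} + \cdots + a_0 = 0. \]

Define numbers M, M_1, . . . , M_n and ε_1, . . . , ε_n as follows:

\[ M = \int_0^\infty \frac{x^{p-1}[(x-1)\cdots(x-n)]^p e^{-x}}{(p-1)!}\,dx, \]
\[ M_k = e^k \int_k^\infty \frac{x^{p-1}[(x-1)\cdots(x-n)]^p e^{-x}}{(p-1)!}\,dx, \quad k = 1, \ldots, n, \]
\[ \epsilon_k = e^k \int_0^k \frac{x^{p-1}[(x-1)\cdots(x-n)]^p e^{-x}}{(p-1)!}\,dx, \quad k = 1, \ldots, n. \]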


The unspecified number p represents a prime number* which we will choose
later. Despite the forbidding aspect of these three expressions, with a little
work they will appear much more reasonable.

* The term "prime number" was defined in Problem 2-16. An important fact about prime
numbers will be used in the proof, although it is not proved in this book: If p is a prime
number which does not divide the integer a, and which does not divide the integer b,
then p also does not divide ab. The Suggested Reading mentions references for this theorem
(which is crucial in proving that the factorization of an integer into primes is unique). We
will also use the result of Problem 2-16(d), that there are infinitely many primes; the reader
is asked to determine at precisely which points this information is required.

We concentrate on M first. If the expression in brackets,

\[ [(x-1)\cdots(x-n)], \]

is actually multiplied out, we obtain a polynomial

\[ x^n + \cdots \pm n! \]

with integer coefficients. When raised to the pth power this becomes an even
more complicated polynomial

\[ x^{np} + \cdots \pm (n!)^p. \]

Thus M can be written in the form

\[ M = \sum_{\alpha=0}^{np} \frac{1}{(p-1)!}\, C_\alpha \int_0^\infty x^{p-1+\alpha} e^{-x}\,dx, \]

where the C_α are certain integers, and C_0 = ±(n!)^p. But

\[ \int_0^\infty x^k e^{-x}\,dx = k!. \]

Thus

\[ M = \sum_{\alpha=0}^{np} C_\alpha \frac{(p-1+\alpha)!}{(p-1)!}. \]

Now, for α = 0 we obtain the term

\[ \pm (n!)^p \frac{(p-1)!}{(p-1)!} = \pm (n!)^p. \]

We will now consider only primes p > n; then this term is an integer which is
not divisible by p. On the other hand, if α > 0, then

\[ C_\alpha \frac{(p-1+\alpha)!}{(p-1)!} = C_\alpha (p+\alpha-1)(p+\alpha-2)\cdots p, \]

which is divisible by p. Therefore M itself is an integer which is not divisible
by p.


Now consider M_k. We have

\[ M_k = e^k \int_k^\infty \frac{x^{p-1}[(x-1)\cdots(x-n)]^p e^{-x}}{(p-1)!}\,dx
      = \int_k^\infty \frac{x^{p-1}[(x-1)\cdots(x-n)]^p e^{-(x-k)}}{(p-1)!}\,dx. \]

This can be transformed into an expression looking very much like M by the
substitution

\[ u = x - k, \qquad du = dx. \]

The limits of integration are changed to 0 and ∞, and

\[ M_k = \int_0^\infty \frac{(u+k)^{p-1}[(u+k-1)\cdots u \cdots (u+k-n)]^p e^{-u}}{(p-1)!}\,du. \]

There is one very significant difference between this expression and that for M.
The term in brackets contains the factor u in the kth place. Thus the pth power
contains the factor u^p. This means that the entire expression

\[ (u+k)^{p-1}[(u+k-1)\cdots(u+k-n)]^p \]

is a polynomial with integer coefficients, every term of which has degree at least p.
Thus

\[ M_k = \sum_{\alpha=1}^{np} \frac{1}{(p-1)!}\, D_\alpha \int_0^\infty u^{p-1+\alpha} e^{-u}\,du
      = \sum_{\alpha=1}^{np} D_\alpha \frac{(p-1+\alpha)!}{(p-1)!}, \]

where the D_α are certain integers. Notice that the summation begins with
α = 1; in this case every term in the sum is divisible by p. Thus each M_k is an
integer which is divisible by p.
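
The divisibility claims about M and M_k can be tested for small n and p with integer arithmetic alone, replacing each integral of u^j e^{-u} by j!. This is a sketch under that substitution; the helper names (`poly_mul`, `poly_pow`, `integrand_poly`, `gamma_sum`) are invented for the illustration and polynomials are stored as coefficient lists indexed by degree.

```python
from math import factorial


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = degree)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] += ai * bj
    return result


def poly_pow(a, exponent):
    """Raise a polynomial to a nonnegative integer power."""
    result = [1]
    for _ in range(exponent):
        result = poly_mul(result, a)
    return result


def integrand_poly(n, p, shift):
    """Coefficients, in powers of u, of
    (u + shift)^(p-1) * [(u + shift - 1) ... (u + shift - n)]^p."""
    bracket = [1]
    for i in range(1, n + 1):
        bracket = poly_mul(bracket, [shift - i, 1])
    return poly_mul(poly_pow([shift, 1], p - 1), poly_pow(bracket, p))


def gamma_sum(n, p, shift):
    """Integral over [0, infinity) of the polynomial times e^(-u)/(p-1)!,
    computed term by term from the formula 'integral of u^j e^(-u) = j!'."""
    total = sum(c * factorial(j) for j, c in enumerate(integrand_poly(n, p, shift)))
    quotient, remainder = divmod(total, factorial(p - 1))
    assert remainder == 0  # the text's argument says the result is an integer
    return quotient


def M(n, p):
    return gamma_sum(n, p, 0)


def M_k(n, p, k):
    return gamma_sum(n, p, k)


if __name__ == "__main__":
    n, p = 2, 3  # p is a prime larger than n
    print(M(n, p), M(n, p) % p)
    print([M_k(n, p, k) % p for k in range(1, n + 1)])
```

With shift 0 the polynomial is the one appearing in M; with shift k it is the one obtained after the substitution u = x - k, so the same routine serves for every M_k.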
Now it is clear that

\[ e^k = \frac{M_k + \epsilon_k}{M}, \quad k = 1, \ldots, n. \]

Substituting into (*) and multiplying by M we obtain

\[ [a_0 M + a_1 M_1 + \cdots + a_n M_n] + [a_1 \epsilon_1 + \cdots + a_n \epsilon_n] = 0. \]

In addition to requiring that p > n let us also stipulate that p > |a_0|. This
means that both M and a_0 are not divisible by p, so a_0 M is also not divisible
by p. Since each M_k is divisible by p, it follows that

\[ a_0 M + a_1 M_1 + \cdots + a_n M_n \]

is not divisible by p. In particular it is a nonzero integer.
In order to obtain a contradiction to the assumed equation (*), and thereby
prove that e is transcendental, it is only necessary to show that

\[ |a_1 \epsilon_1 + \cdots + a_n \epsilon_n| \]

can be made as small as desired, by choosing p large enough; it is clearly sufficient
to show that each |ε_k| can be made as small as desired. This requires
nothing more than some simple estimates; for the remainder of the argument
remember that n is a certain fixed number (the degree of the assumed polynomial
equation (*)). To begin with, if 1 ≤ k ≤ n, then

\[ |\epsilon_k| \le e^k \int_0^k \frac{x^{p-1}\,|(x-1)\cdots(x-n)|^p\, e^{-x}}{(p-1)!}\,dx
   \le e^n \int_0^n \frac{n^{p-1}\,|(x-1)\cdots(x-n)|^p\, e^{-x}}{(p-1)!}\,dx. \]

Now let A be the maximum of |(x - 1) · · · (x - n)| for x in [0, n]. Then

\[ |\epsilon_k| \le \frac{e^n n^{p-1} A^p}{(p-1)!} \int_0^n e^{-x}\,dx
   \le \frac{e^n n^{p-1} A^p}{(p-1)!} \int_0^\infty e^{-x}\,dx
   = \frac{e^n n^{p-1} A^p}{(p-1)!}
   \le \frac{e^n n^p A^p}{(p-1)!}
   = \frac{e^n (nA)^p}{(p-1)!}. \]

But n and A are fixed; thus (nA)^p/(p - 1)! can be made as small as desired
by making p sufficiently large. ∎


This proof, like the proof that π is irrational, deserves some philosophic
afterthoughts. At first sight, the argument seems quite "advanced"; after all,
we use integrals, and integrals from 0 to ∞ at that. Actually, as many mathematicians
have observed, integrals can be eliminated from the argument completely;
the only integrals essential to the proof are of the form

\[ \int_0^\infty x^k e^{-x}\,dx, \]

for integral k, and these integrals can be replaced by k! whenever they occur.
Thus M, for example, could have been defined initially as

\[ M = \sum_{\alpha=0}^{np} C_\alpha \frac{(p-1+\alpha)!}{(p-1)!}, \]

where C_α are the coefficients of the polynomial

\[ [(x-1)\cdots(x-n)]^p. \]

If this idea is developed consistently, one obtains a "completely elementary"
proof that e is transcendental, depending only on the fact that

\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots. \]

Unfortunately, this "elementary" proof is harder to understand than the
original one: the whole structure of the proof must be hidden just to eliminate
a few integral signs! This situation is by no means peculiar to this specific
theorem; "elementary" arguments are frequently more difficult than
"advanced" ones. Our proof that π is irrational is a case in point. You probably
remember nothing about this proof except that it involves quite a few complicated
functions. There is actually a more advanced, but much more conceptual
proof, which shows that π is transcendental, a fact which is of great
historical, as well as intrinsic, interest. One of the classical problems of Greek
mathematics was to construct, with compass and straightedge alone, a square
whose area is that of a circle of radius 1. This requires the construction of a
line segment whose length is √π, which can be accomplished if a line segment
of length π is constructible. The Greeks were totally unable to decide whether
such a line segment could be constructed, and even the full resources of
modern mathematics were unable to settle this question until 1882. In that
year Lindemann proved that π is transcendental; since the length of any segment
that can be constructed with straightedge and compass can be written
in terms of +, ·, -, ÷, and √, and is therefore algebraic, this proves that a
line segment of length π cannot be constructed.

The proof that π is transcendental requires a sizable amount of mathematics
which is too advanced to be reached in this book. Nevertheless, the proof is
not much more difficult than the proof that e is transcendental. In fact, the
proof for π is practically the same as the proof for e. This last statement should
certainly surprise you. The proof that e is transcendental seems to depend so
thoroughly on particular properties of e that it is almost inconceivable how
any modifications could ever be used for π; after all, what does e have to do
with π? Just wait and see!

PROBLEMS 


1. (a) Prove that if α > 0 is algebraic, then √α is algebraic.
   (b) Prove that if α is algebraic and r is rational, then α + r and αr are
       algebraic.

   Part (b) can actually be strengthened considerably: the sum, product,
   and quotient of algebraic numbers is algebraic. This fact is too difficult
   for us to prove here, but some special cases can be examined:

2. Prove that √2 + √3 and √2(1 + √3) are algebraic, by actually finding
   algebraic equations which they satisfy. (You will need equations of
   degree 4.)

*3. (a) Let α be an algebraic number which is not rational. Suppose that α
        satisfies the polynomial equation

\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 = 0, \]

        and that no polynomial function of lower degree has this property.
        Show that f(p/q) ≠ 0 for any rational number p/q. Hint: Use Problem
        3-7(b).
    (b) Now show that |f(p/q)| ≥ 1/q^n for all rational numbers p/q with
        q > 0. Hint: Write f(p/q) as a fraction over the common denominator
        q^n.
    (c) Let M = sup{|f'(x)|: |x - α| < 1}. Use the Mean Value Theorem
        to prove that if p/q is a rational number with |α - p/q| < 1, then
        |α - p/q| > 1/Mq^n. (It follows that for c = min(1, 1/M) we have
        |α - p/q| > c/q^n for all rational p/q.)

*4. Let

\[ \alpha = 0.110001000000000000000001000\ldots, \]

    where the 1's occur in the n! place, for each n. Use Problem 3 to prove
    that α is transcendental. (For each n, show that α is not the root of an
    equation of degree n.)


Although Problem 4 mentions only one specific transcendental number, it
should be clear that one can easily construct infinitely many other numbers α
which do not satisfy |α - p/q| > c/q^n for any c and n. Such numbers were
first considered by Liouville (1809-1882), and the inequality in Problem 3 is
often called Liouville's inequality. None of the transcendental numbers constructed
in this way happens to be particularly interesting, but for a long time
Liouville's transcendental numbers were the only ones known. This situation
was changed quite radically by the work of Cantor (1845-1918), who showed,
without exhibiting a single transcendental number, that most numbers are
transcendental. The next two problems provide an introduction to the ideas
that allow us to make sense of such statements. The basic definition with which
we must work is the following: A set A is called countable if its elements can
be arranged in a sequence

\[ a_1, a_2, a_3, a_4, \ldots. \]

The obvious example (in fact, more or less the Platonic ideal of) a countable
set is N, the set of natural numbers; clearly the set of even natural numbers is
also countable:

\[ 2, 4, 6, 8, \ldots. \]

It is a little more surprising to learn that Z, the set of all integers (positive,
negative, and 0) is also countable, but seeing is believing:

\[ 0, 1, -1, 2, -2, 3, -3, \ldots. \]

The next two problems, which outline the basic features of countable sets, are
really a series of examples to show that (1) a lot more sets are countable than
one might think and (2) nevertheless, some sets are not countable.
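
The listing of Z can be written as a procedure that produces the sequence term by term. The sketch below is illustrative only; `integers` and `position` are names chosen here, and positions are counted from 1 to match the notation a_1, a_2, a_3, . . . .

```python
from itertools import count, islice


def integers():
    """Yield 0, 1, -1, 2, -2, ... so that every integer appears exactly once."""
    yield 0
    for n in count(1):
        yield n
        yield -n


def position(k: int) -> int:
    """Index (starting from 1) at which the integer k appears in the sequence."""
    if k == 0:
        return 1
    return 2 * k if k > 0 else -2 * k + 1


if __name__ == "__main__":
    print(list(islice(integers(), 9)))
    print(position(-4))
```

The explicit formula in `position` is what "arranged in a sequence" amounts to: each integer is reached after finitely many steps.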


*5. (a) Show that if A and B are countable, then so is A ∪ B = {x: x is in A
        or x is in B}. Hint: Use the same trick that worked for Z.
    (b) Show that the set of positive rational numbers is countable. (This is
        really quite startling; use the following look-and-see proof:

        [Figure: the positive rational numbers arranged in an infinite array
        and traversed along successive diagonals.])

    (c) Show that the set of all pairs (m, n) of integers is countable. (This is
        practically the same as part (b).)
    (d) If A_1, A_2, A_3, . . . are each countable, prove that

\[ A_1 \cup A_2 \cup A_3 \cup \cdots \]

        is also countable. (Again use the same trick as in part (b).)
    (e) Prove that the set of all triples (l, m, n) of integers is countable. (A
        triple (l, m, n) can be described by a pair (l, m) and a number n.)
    (f) Prove that the set of all n-tuples (a_1, a_2, . . . , a_n) is countable. (If
        you have done part (e), you can do this, using induction.)
    (g) Prove that the set of all roots of polynomial functions of degree n is
        countable. (Part (f) shows that the set of all polynomial functions of
        degree n can be arranged in a sequence, and each of these functions
        has at most n roots.)
    (h) Now use parts (d) and (g) to prove that the set of all algebraic numbers
        is countable.

*6. Since so many sets turn out to be countable, it is important to note that the
    set of all real numbers between 0 and 1 is not countable. In other words,
    there is no way of listing all these real numbers in a sequence

\[ \alpha_1 = 0.a_1^1 a_2^1 a_3^1 a_4^1 \ldots \]
\[ \alpha_2 = 0.a_1^2 a_2^2 a_3^2 a_4^2 \ldots \]
\[ \alpha_3 = 0.a_1^3 a_2^3 a_3^3 a_4^3 \ldots \]
\[ \vdots \]

    (decimal notation is being used on the right). To prove that this is so,
    suppose such a list were possible and consider the decimal

\[ 0.\bar{a}_1^1 \bar{a}_2^2 \bar{a}_3^3 \bar{a}_4^4 \ldots, \]

    where \bar{a}_n^n = 5 if a_n^n ≠ 5 and \bar{a}_n^n = 6 if a_n^n = 5. Show that this number
    cannot possibly be in the list, thus obtaining a contradiction.
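
The diagonal rule of Problem 6 is mechanical enough to run on a finite sample. The following is a sketch on a made-up finite list of digit strings (the real argument concerns an infinite list); `diagonal_digits` is a hypothetical name for this illustration.

```python
def diagonal_digits(rows):
    """Given rows[i] = the digits of the i-th listed decimal (as a string),
    return digits chosen by the rule: 5 if the i-th digit of the i-th row
    is not 5, and 6 if it is 5."""
    return "".join("6" if row[i] == "5" else "5" for i, row in enumerate(rows))


if __name__ == "__main__":
    sample = ["1415", "5555", "0000", "1234"]  # illustrative data only
    new_digits = diagonal_digits(sample)
    print("0." + new_digits)
    # The new decimal differs from the i-th row in the i-th place.
    print(all(new_digits[i] != row[i] for i, row in enumerate(sample)))
```

Using only the digits 5 and 6 sidesteps the ambiguity of decimals ending in repeated 9's, which is presumably why the problem chooses them.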


Problems 5 and 6 can be summed up as follows. The set of algebraic numbers
is countable. If the set of transcendental numbers were also countable,
then the set of all real numbers would be countable, by Problem 5(a), and
consequently the set of real numbers between 0 and 1 would be countable.
But this is false. Thus, the set of algebraic numbers is countable and the set of
transcendental numbers is not ("there are more transcendental numbers than
algebraic numbers"). The remaining two problems illustrate further how
important it can be to distinguish between sets which are countable and sets
which are not.


*7. Let $f$ be a nondecreasing function on [0, 1]. Recall (Problem 8-8) that
$\lim_{x \to a^+} f(x)$ and $\lim_{x \to a^-} f(x)$ both exist.

(a) For any $\varepsilon > 0$ prove that there are only finitely many numbers $a$
in [0, 1] with $\lim_{x \to a^+} f(x) - \lim_{x \to a^-} f(x) > \varepsilon$. Hint: There are, in fact, at
most $[f(1) - f(0)]/\varepsilon$ of them.
(b) Prove that the set of points at which $f$ is discontinuous is countable.
Hint: If $\lim_{x \to a^+} f(x) - \lim_{x \to a^-} f(x) > 0$, then it is $> 1/n$ for some natural
number $n$.


This problem shows that a nondecreasing function is automatically
continuous at most points. For differentiability the situation is more
difficult to analyze and also more interesting. A nondecreasing function
can fail to be differentiable at a set of points which is not countable,
but it is still true that nondecreasing functions are differentiable
at most points (in a different sense of the word "most"). Reference
[33] of the Suggested Reading gives a beautiful proof, using the
Rising Sun Lemma of Problem 8-20. For those who have done
Problem 9 of the Appendix to Chapter 11, it is possible to provide
at least one application to differentiability of the ideas already
developed in this problem set: If $f$ is convex, then $f$ is differentiable
except at those points where its right-hand derivative $f_+'$ is discontinuous;
but the function $f_+'$ is increasing, so a convex function is
automatically differentiable except at a countable set of points.

*8. (a) Problem 11-50 showed that if every point is a local maximum point
for a continuous function $f$, then $f$ is a constant function. Suppose now
that the hypothesis of continuity is dropped. Prove that $f$ takes on
only a countable set of values. Hint: For each $x$ choose rational numbers
$a_x$ and $b_x$ such that $a_x < x < b_x$ and $x$ is a maximum point for
$f$ on $(a_x, b_x)$. Then every value $f(x)$ is the maximum value of $f$ on some
interval $(a_x, b_x)$. How many such intervals are there?

(b) Now deduce Problem 11-50 as a corollary of part (a).

(c) Suppose that every point is either a local maximum or a local
minimum point for $f$. Prove that $f$ takes on only a countable set of
values. (This is a minor variant of part (a).)

(d) Now suppose that $f$ is continuous, and that every point is either a
local maximum or a local minimum point for $f$. Show that $f$ is a constant
function. (Although this statement is only a minor variant of
Problem 11-50, the only proof I know uses part (c), and thus relies on
very different ideas than the ones in Chapter 11.)


CHAPTER 21

INFINITE SEQUENCES

The idea of an infinite sequence is so natural a concept that it is tempting to
dispense with a definition altogether. One frequently writes simply "an infinite
sequence

$$a_1, a_2, a_3, a_4, a_5, \ldots,$$"

the three dots indicating that the numbers $a_i$ continue to the right "forever."
A rigorous definition of an infinite sequence is not hard to formulate, however;
the important point about an infinite sequence is that for each natural number
$n$ there is a real number $a_n$. This sort of correspondence is precisely what
functions are meant to formalize.

DEFINITION

An infinite sequence of real numbers is a function whose domain is N.

From the point of view of this definition, a sequence should be designated
by a single letter like $a$, and particular values by

$$a(1), a(2), a(3), \ldots,$$

but the subscript notation

$$a_1, a_2, a_3, \ldots$$

is almost always used instead, and the sequence itself is usually denoted by a
symbol like $\{a_n\}$. Thus $\{n\}$, $\{(-1)^n\}$, and $\{1/n\}$ denote the sequences $\alpha$, $\beta$,
and $\gamma$ defined by

$$\alpha_n = n,$$
$$\beta_n = (-1)^n,$$
$$\gamma_n = \frac{1}{n}.$$

A sequence, like any function, can be graphed (Figure 1) but the graph is
usually rather unrevealing, since most of the function cannot be fit on the page.


FIGURE 1

A more convenient representation of a sequence is obtained by simply
labeling the points $a_1, a_2, a_3, \ldots$ on a line (Figure 2). This sort of picture
shows where the sequence "is going." The sequence $\{\alpha_n\}$ "goes out to
infinity," the sequence $\{\beta_n\}$ "jumps back and forth between $-1$ and 1," and
the sequence $\{\gamma_n\}$ "converges to 0." Of the three phrases in quotation marks,
the last is the crucial concept associated with sequences, and will be defined
precisely (the definition is illustrated in Figure 3).

FIGURE 2

FIGURE 3

DEFINITION

A sequence $\{a_n\}$ converges to $l$ (in symbols $\lim_{n \to \infty} a_n = l$) if for every $\varepsilon > 0$
there is a natural number $N$ such that, for all natural numbers $n$,

$$\text{if } n > N, \text{ then } |a_n - l| < \varepsilon.$$


In addition to the terminology introduced in this definition, we sometimes
say that the sequence $\{a_n\}$ approaches $l$ or has the limit $l$. A sequence $\{a_n\}$ is
said to converge if it converges to $l$ for some $l$, and to diverge if it does not
converge.

To show that the sequence $\{\gamma_n\}$ converges to 0, it suffices to observe the
following. If $\varepsilon > 0$, there is a natural number $N$ such that $1/N < \varepsilon$. Then, if
$n > N$ we have

$$\frac{1}{n} < \frac{1}{N} < \varepsilon, \quad \text{so } |\gamma_n - 0| < \varepsilon.$$
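The choice of $N$ in this argument can be made explicit. The sketch below is illustrative only: the helper names `find_N` and `tail_within` are hypothetical, and checking finitely many terms is of course no substitute for the proof just given.

```python
import math


def find_N(eps):
    """Return a natural number N with 1/N < eps (one possible choice)."""
    return math.floor(1 / eps) + 1


def tail_within(eps, N, how_many=1000):
    """Check |1/n - 0| < eps for the first `how_many` values of n beyond N."""
    return all(abs(1 / n - 0) < eps for n in range(N + 1, N + 1 + how_many))


for eps in (0.5, 0.1, 0.003):
    N = find_N(eps)
    print(eps, N, 1 / N < eps, tail_within(eps, N))
```

Any larger $N$ would serve equally well; the definition asks only that some $N$ exist for each $\varepsilon$.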


The limit

$$\lim_{n \to \infty} \sqrt{n+1} - \sqrt{n} = 0$$

will probably seem reasonable after a little reflection (it just says that $\sqrt{n+1}$
is practically the same as $\sqrt{n}$ for large $n$), but a mathematical proof might not
be so obvious. To estimate $\sqrt{n+1} - \sqrt{n}$ we can use an algebraic trick:

$$\sqrt{n+1} - \sqrt{n} = \frac{(\sqrt{n+1} - \sqrt{n})(\sqrt{n+1} + \sqrt{n})}{\sqrt{n+1} + \sqrt{n}}
= \frac{n + 1 - n}{\sqrt{n+1} + \sqrt{n}}
= \frac{1}{\sqrt{n+1} + \sqrt{n}}.$$

It is also possible to estimate $\sqrt{n+1} - \sqrt{n}$ by applying the Mean Value
Theorem to the function $f(x) = \sqrt{x}$ on the interval $[n, n+1]$. We obtain

$$\sqrt{n+1} - \sqrt{n} = \frac{\sqrt{n+1} - \sqrt{n}}{(n+1) - n} = f'(x) = \frac{1}{2\sqrt{x}} \quad \text{for some } x \text{ in } (n, n+1),$$

so that

$$\sqrt{n+1} - \sqrt{n} < \frac{1}{2\sqrt{n}}.$$

Either of these estimates may be used to prove the above limit; the detailed
proof is left to you, as a simple but valuable exercise.
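A quick numerical comparison of the two estimates may be reassuring before the proof is written out. This is only a sketch with hypothetical helper names (`difference`, `closed_form`, `mvt_bound`); floating-point arithmetic illustrates the estimates but does not prove them.

```python
import math


def difference(n):
    """sqrt(n + 1) - sqrt(n), computed directly."""
    return math.sqrt(n + 1) - math.sqrt(n)


def closed_form(n):
    """The value 1 / (sqrt(n + 1) + sqrt(n)) obtained from the algebraic trick."""
    return 1 / (math.sqrt(n + 1) + math.sqrt(n))


def mvt_bound(n):
    """The bound 1 / (2 sqrt(n)) obtained from the Mean Value Theorem."""
    return 1 / (2 * math.sqrt(n))


for n in (1, 10, 100, 10_000):
    print(n, difference(n), closed_form(n), mvt_bound(n))
```

Both estimates show the difference is at most $1/(2\sqrt{n})$, which can be made smaller than any given $\varepsilon > 0$ by taking $n$ large.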


The limit

$$\lim_{n \to \infty} \frac{3n^3 + 7n^2 + 1}{4n^3 - 8n + 63} = \frac{3}{4}$$

should also seem reasonable, because the terms involving $n^3$ are the most
important when $n$ is large. If you remember the proof of Theorem 7-9 you will
be able to guess the trick that translates this idea into a proof: dividing top
and bottom by $n^3$ yields

$$\frac{3n^3 + 7n^2 + 1}{4n^3 - 8n + 63} = \frac{3 + \dfrac{7}{n} + \dfrac{1}{n^3}}{4 - \dfrac{8}{n^2} + \dfrac{63}{n^3}}.$$


Using this expression, the proof of the above limit is not difficult, especially if
one uses the following facts:
If $\lim_{n \to \infty} a_n$ and $\lim_{n \to \infty} b_n$ both exist, then

$$\lim_{n \to \infty} (a_n + b_n) = \lim_{n \to \infty} a_n + \lim_{n \to \infty} b_n,$$
$$\lim_{n \to \infty} (a_n \cdot b_n) = \lim_{n \to \infty} a_n \cdot \lim_{n \to \infty} b_n;$$

moreover, if $\lim_{n \to \infty} b_n \ne 0$, then $b_n \ne 0$ for all $n$ greater than some $N$, and

$$\lim_{n \to \infty} a_n / b_n = \lim_{n \to \infty} a_n \Big/ \lim_{n \to \infty} b_n.$$

(If we wanted to be utterly precise, the third statement would have to be even
more complicated. As it stands, we are considering the limit of the sequence
$\{c_n\} = \{a_n/b_n\}$, where the numbers $c_n$ might not even be defined for certain
$n < N$. This doesn't really matter (we could define $c_n$ any way we liked for
such $n$), because the limit of a sequence is not changed if we change the
sequence at a finite number of points.)

Although these facts are very useful, we will not bother stating them as a
theorem; you should have no difficulty proving these results for yourself,
because the definition of $\lim_{n \to \infty} a_n = l$ is so similar to previous definitions of
limits, especially $\lim_{x \to \infty} f(x) = l$.


The similarity between the definitions of $\lim_{n \to \infty} a_n = l$ and $\lim_{x \to \infty} f(x) = l$ is
actually closer than mere analogy; it is possible to define the first in terms of
the second. If $f$ is the function whose graph (Figure 4) consists of line segments
joining the points in the graph of the sequence $\{a_n\}$, so that

$$f(x) = (a_{n+1} - a_n)(x - n) + a_n, \qquad n \le x \le n + 1,$$

then

$$\lim_{n \to \infty} a_n = l \quad \text{if and only if} \quad \lim_{x \to \infty} f(x) = l.$$

FIGURE 4

This observation is frequently very useful. For example, suppose that
$0 < a < 1$. Then

$$\lim_{n \to \infty} a^n = 0.$$

To prove this we note that

$$\lim_{x \to \infty} a^x = \lim_{x \to \infty} e^{x \log a} = 0,$$

since $\log a < 0$, so that $x \log a$ is negative and large in absolute value for
large $x$. Notice that we actually have

$$\lim_{n \to \infty} a^n = 0 \quad \text{if } |a| < 1;$$

for if $a < 0$ we can write

$$\lim_{n \to \infty} a^n = \lim_{n \to \infty} (-1)^n |a|^n = 0.$$


The behavior of the logarithm function also shows that if $a > 1$, then $a^n$
becomes arbitrarily large as $n$ becomes large. This assertion is often written

$$\lim_{n \to \infty} a^n = \infty, \qquad a > 1,$$

and it is sometimes even said that $\{a^n\}$ approaches $\infty$. We also write equations
like

$$\lim_{n \to \infty} -a^n = -\infty,$$

and say that $\{-a^n\}$ approaches $-\infty$. Notice, however, that if $a < -1$, then
$\lim_{n \to \infty} a^n$ does not exist, even in this extended sense.

Despite this connection with a familiar concept, it is more important to
visualize convergence in terms of the picture of a sequence as points on a line
(Figure 3). There is another connection between limits of functions and limits
of sequences which is related to this picture. This connection is somewhat less
obvious, but considerably more interesting, than the one previously mentioned:
instead of defining limits of sequences in terms of limits of functions, it is
possible to reverse the procedure.

THEOREM 1

Let $f$ be a function defined in an open interval containing $c$, except perhaps
at $c$ itself, with

$$\lim_{x \to c} f(x) = l.$$

Suppose that $\{a_n\}$ is a sequence such that

(1) each $a_n$ is in the domain of $f$,
(2) each $a_n \ne c$,
(3) $\lim_{n \to \infty} a_n = c$.

Then the sequence $\{f(a_n)\}$ satisfies

$$\lim_{n \to \infty} f(a_n) = l.$$

Conversely, if this is true for every sequence $\{a_n\}$ satisfying the above conditions,
then $\lim_{x \to c} f(x) = l$.

PROOF

Suppose first that $\lim_{x \to c} f(x) = l$. Then for every $\varepsilon > 0$ there is a $\delta > 0$ such
that, for all $x$,

$$\text{if } 0 < |x - c| < \delta, \text{ then } |f(x) - l| < \varepsilon.$$

If the sequence $\{a_n\}$ satisfies $\lim_{n \to \infty} a_n = c$, then (Figure 3) there is a natural
number $N$ such that,

$$\text{if } n > N, \text{ then } |a_n - c| < \delta.$$

Since each $a_n \ne c$, we also have $0 < |a_n - c|$. By our choice of $\delta$, this means that

$$|f(a_n) - l| < \varepsilon,$$

showing that

$$\lim_{n \to \infty} f(a_n) = l.$$

Suppose, conversely, that $\lim_{n \to \infty} f(a_n) = l$ for every sequence $\{a_n\}$ with
$\lim_{n \to \infty} a_n = c$. If $\lim_{x \to c} f(x) = l$ were not true, there would be some $\varepsilon > 0$ such
that for every $\delta > 0$ there is an $x$ with

$$0 < |x - c| < \delta \quad \text{but} \quad |f(x) - l| > \varepsilon.$$

In particular, for each $n$ there would be a number $x_n$ such that

$$0 < |x_n - c| < \frac{1}{n} \quad \text{but} \quad |f(x_n) - l| > \varepsilon.$$

Now the sequence $\{x_n\}$ clearly converges to $c$ but, since $|f(x_n) - l| > \varepsilon$ for
all $n$, the sequence $\{f(x_n)\}$ does not converge to $l$. This contradicts the
hypothesis, so $\lim_{x \to c} f(x) = l$ must be true. ∎

Theorem 1 provides many examples of convergent sequences. For example,
the sequences $\{a_n\}$ and $\{b_n\}$ defined by

$$a_n = \sin\left(13 + \frac{1}{n^2}\right),$$
$$b_n = \cos\left(\sin\left(1 + (-1)^n \cdot \frac{1}{n}\right)\right)$$

clearly converge to $\sin(13)$ and $\cos(\sin(1))$, respectively. It is important,
however, to have some criteria guaranteeing convergence of sequences which
are not obviously of this sort. There is one important criterion which is very
easy to prove, but which is the basis for all other results. This criterion is stated
in terms of concepts defined for functions, which therefore apply also to
sequences: a sequence $\{a_n\}$ is increasing if $a_{n+1} > a_n$ for all $n$, nondecreasing
if $a_{n+1} \ge a_n$ for all $n$, and bounded above if there is a number $M$ such that
$a_n \le M$ for all $n$; there are similar definitions for sequences which are
decreasing, nonincreasing, and bounded below.

THEOREM 2

If $\{a_n\}$ is nondecreasing and bounded above, then $\{a_n\}$ converges (a similar
statement is true if $\{a_n\}$ is nonincreasing and bounded below).

FIGURE 5

PROOF

The set $A$ consisting of all numbers $a_n$ is, by assumption, bounded above, so $A$
has a least upper bound $\alpha$. We claim that $\lim_{n \to \infty} a_n = \alpha$ (Figure 5). In fact,
if $\varepsilon > 0$, there is some $a_N$ satisfying $\alpha - a_N < \varepsilon$, since $\alpha$ is the least upper
bound of $A$. Then if $n > N$ we have

$$a_n \ge a_N, \quad \text{so } \alpha - a_n \le \alpha - a_N < \varepsilon.$$

This proves that $\lim_{n \to \infty} a_n = \alpha$. ∎
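Theorem 2 can be watched at work on a concrete sequence. The sketch below uses the sequence $\sqrt{2}, \sqrt{2\sqrt{2}}, \ldots$ of Problem 4 at the end of this chapter, defined by $a_1 = \sqrt{2}$ and $a_{n+1} = \sqrt{2a_n}$; the function name `sqrt_tower` is hypothetical. A finite computation only illustrates the hypotheses (nondecreasing, bounded above by 2) and does not establish them.

```python
import math


def sqrt_tower(count):
    """Return [a_1, ..., a_count] with a_1 = sqrt(2) and a_{n+1} = sqrt(2 * a_n)."""
    terms = [math.sqrt(2)]
    while len(terms) < count:
        terms.append(math.sqrt(2 * terms[-1]))
    return terms


terms = sqrt_tower(40)

# The two hypotheses of Theorem 2, checked on the computed terms only.
nondecreasing = all(terms[i] <= terms[i + 1] for i in range(len(terms) - 1))
bounded_above = all(t <= 2 for t in terms)
print(nondecreasing, bounded_above)
print(terms[:5])
```

Theorem 2 guarantees that such a sequence converges, without saying what the limit is; identifying the limit takes a separate argument.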


The hypothesis that $\{a_n\}$ is bounded above is clearly essential in Theorem
2: if $\{a_n\}$ is not bounded above, then (whether or not $\{a_n\}$ is nondecreasing)
$\{a_n\}$ clearly diverges. Upon first consideration, it might appear that there
should be little trouble deciding whether or not a given nondecreasing
sequence $\{a_n\}$ is bounded above, and consequently whether or not $\{a_n\}$ converges.
In the next chapter such sequences will arise very naturally and, as
we shall see, deciding whether or not they converge is hardly a trivial matter.
For the present, you might try to decide whether or not the following
(obviously increasing) sequence is bounded above:

$$1, \quad 1 + \frac{1}{2}, \quad 1 + \frac{1}{2} + \frac{1}{3}, \quad 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}, \quad \ldots.$$

Although Theorem 2 treats only a very special class of sequences, it is more
useful than might appear at first, because it is always possible to extract from
an arbitrary sequence $\{a_n\}$ another sequence which is either nonincreasing or
else nondecreasing. To be precise, let us define a subsequence of the sequence
$\{a_n\}$ to be a sequence of the form

$$a_{n_1}, a_{n_2}, a_{n_3}, \ldots,$$

where the $n_j$ are natural numbers with

$$n_1 < n_2 < n_3 < \cdots.$$

Then every sequence contains a subsequence which is either nondecreasing
or nonincreasing. It is possible to become quite befuddled trying to prove this
assertion, although the proof is very short if you think of the right idea; it is
worth recording as a lemma.

LEMMA

Any sequence $\{a_n\}$ contains a subsequence which is either nondecreasing or
nonincreasing.

PROOF

Call a natural number $n$ a "peak point" of the sequence $\{a_n\}$ if $a_m < a_n$ for
all $m > n$ (Figure 6).

FIGURE 6 (graph of a sequence in which 2 and 6 are peak points)

Case 1. The sequence has infinitely many peak points. In this case, if $n_1 < n_2 <
n_3 < \cdots$ are the peak points, then $a_{n_1} > a_{n_2} > a_{n_3} > \cdots$, so $\{a_{n_k}\}$ is the
desired (nonincreasing) subsequence.

Case 2. The sequence has only finitely many peak points. In this case, let $n_1$ be
greater than all peak points. Since $n_1$ is not a peak point, there is some $n_2 > n_1$
such that $a_{n_2} \ge a_{n_1}$. Since $n_2$ is not a peak point (it is greater than $n_1$, and hence
greater than all peak points) there is some $n_3 > n_2$ such that $a_{n_3} \ge a_{n_2}$.
Continuing in this way we obtain the desired (nondecreasing) subsequence. ∎

If we assume that our original sequence $\{a_n\}$ is bounded, we can pick up
an extra corollary along the way.

COROLLARY (THE BOLZANO-WEIERSTRASS THEOREM)

Every bounded sequence has a convergent subsequence.
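The notion of a peak point in the proof of the Lemma can be explored on a finite list. This is a sketch under a loud assumption: whether $n$ is a peak point of an infinite sequence depends on the entire tail, so the hypothetical function `peak_indices` below only finds positions that are peaks relative to the finite list it is given.

```python
def peak_indices(seq):
    """Return the indices n such that seq[m] < seq[n] for every later index m.

    Only the finite list `seq` is examined; this is an analogue of the peak
    points in the proof, not a test on an infinite sequence.
    """
    peaks = []
    largest_later = float("-inf")  # maximum of the entries after position n
    for n in range(len(seq) - 1, -1, -1):
        if seq[n] > largest_later:
            peaks.append(n)
            largest_later = seq[n]
    return peaks[::-1]


sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
peaks = peak_indices(sample)
print(peaks)                       # positions of the peaks
print([sample[n] for n in peaks])  # the values there decrease, as in Case 1
```

As in Case 1 of the proof, the values at successive peak points decrease, so they form a nonincreasing subsequence.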


Without some additional assumptions this is as far as we can go: it is easy
to construct sequences having many, even infinitely many, subsequences
converging to different numbers (see Problem 2). There is a reasonable
assumption to add, which yields a necessary and sufficient condition for
convergence of any sequence. Although this condition will not be crucial for
our work, it does simplify many proofs. Moreover, this condition plays a
fundamental role in more advanced investigations, and for this reason alone it is
worth stating now.

If a sequence converges, so that the individual terms are eventually all close
to the same number, then the difference of any two such individual terms
should be very small. To be precise, if $\lim_{n \to \infty} a_n = l$ for some $l$, then for any
$\varepsilon > 0$ there is an $N$ such that $|a_n - l| < \varepsilon/2$ for $n > N$; now if both $n > N$
and $m > N$, then

$$|a_n - a_m| \le |a_n - l| + |l - a_m| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This final inequality, $|a_n - a_m| < \varepsilon$, which eliminates mention of the limit $l$,
can be used to formulate a condition (the Cauchy condition) which is clearly
necessary for convergence of a sequence.

DEFINITION

A sequence $\{a_n\}$ is a Cauchy sequence if for every $\varepsilon > 0$ there is a natural
number $N$ such that, for all $m$ and $n$,

$$\text{if } m, n > N, \text{ then } |a_n - a_m| < \varepsilon.$$

(This condition is usually written $\lim_{m,n \to \infty} |a_m - a_n| = 0$.)

The beauty of the Cauchy condition is that it is also sufficient to ensure
convergence of a sequence. After all our preliminary work, there is very little
left to do in order to prove this.

THEOREM 3

A sequence $\{a_n\}$ converges if and only if it is a Cauchy sequence.

PROOF

We have already shown that $\{a_n\}$ is a Cauchy sequence if it converges. The
proof of the converse assertion contains only one tricky feature: showing that
every Cauchy sequence $\{a_n\}$ is bounded. If we take $\varepsilon = 1$ in the definition of
a Cauchy sequence we find that there is some $N$ such that

$$|a_m - a_n| < 1 \quad \text{for } m, n > N.$$

In particular, this means that

$$|a_m - a_{N+1}| < 1 \quad \text{for all } m > N.$$

Thus $\{a_m : m > N\}$ is bounded; since there are only finitely many other $a_i$'s
the whole sequence is bounded.

The corollary to the Lemma thus implies that some subsequence of $\{a_n\}$
converges.

Only one point remains, whose proof will be left to you: if a subsequence of
a Cauchy sequence converges, then the Cauchy sequence itself converges.
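The Cauchy condition compares terms with one another rather than with a limit, and this can be illustrated numerically. The sketch below is illustrative only: the helper `tail_spread` is a hypothetical name, and it inspects a finite window of terms beyond $N$, whereas the definition concerns all $m, n > N$.

```python
def tail_spread(a, N, window=2000):
    """Largest value of |a(n) - a(m)| over N < m, n <= N + window.

    `a` is a function giving the n-th term; only a finite window is examined.
    """
    values = [a(n) for n in range(N + 1, N + 1 + window)]
    return max(values) - min(values)


# For a_n = 1/n the spread of the terms beyond N shrinks as N grows ...
print(tail_spread(lambda n: 1 / n, 10))
print(tail_spread(lambda n: 1 / n, 100))

# ... while for a_n = (-1)^n it stays equal to 2, so the condition fails.
print(tail_spread(lambda n: (-1) ** n, 100))
```

For $a_n = 1/n$ every term beyond $N$ lies between 0 and $1/N$, so the spread is less than $1/N$, which is exactly what the Cauchy condition requires.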


PROBLEMS 


1. Verify each of the following limits. 


n 
i) hi =e ie 

οὐ 

a. tp 
nae n> + 4 

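(iii) $\lim_{n \to \infty} \sqrt[8]{n^2 + 1} - \sqrt[4]{n + 1} = 0$. Hint: You should at least be able
to prove that $\lim_{n \to \infty} \sqrt[8]{n^2 + 1} - \sqrt[8]{n^2} = 0$.

(iv) $\lim_{n \to \infty} \dfrac{n!}{n^n} = 0$. Hint: $n! = n(n-1) \cdots (k+1) \cdot k!$ for $k < n$, in particular,
for $k < n/2$.

(v) $\lim_{n \to \infty} \sqrt[n]{a} = 1$, $a > 0$.

(vi) $\lim_{n \to \infty} \sqrt[n]{n} = 1$.

(vii) $\lim_{n \to \infty} \sqrt[n]{n^2 + n} = 1$.

(viii) $\lim_{n \to \infty} \sqrt[n]{a^n + b^n} = \max(a, b)$.

(ix) $\lim_{n \to \infty} \dfrac{\alpha(n)}{n} = 0$, where $\alpha(n)$ is the number of primes which divide $n$.
Hint: The fact that each prime is $\ge 2$ gives a very simple estimate
of how small $\alpha(n)$ must be.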


yey 
*(x) lim “Ξ - = 


2. (a) What can be said about the sequence $\{a_n\}$ if it converges and each
$a_n$ is an integer?

(b) Find all convergent subsequences of the sequence $1, -1, 1, -1, 1,
-1, \ldots$. (There are infinitely many, although there are only two
limits which such subsequences can have.)

(c) Find all convergent subsequences of the sequence 1, 2, 1, 2, 3, 1, 2,
3, 4, 1, 2, 3, 4, 5, . . . . (There are infinitely many limits which
such subsequences can have.)

(d) Consider the sequence

$$\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, \tfrac{1}{5}, \tfrac{2}{5}, \tfrac{3}{5}, \tfrac{4}{5}, \ldots.$$

For which numbers $\alpha$ is there a subsequence converging to $\alpha$?

3. (a) Prove that if a subsequence of a Cauchy sequence converges, then
so does the original Cauchy sequence.

(b) Prove that any subsequence of a convergent sequence converges.

4. (a) Prove that if $0 < a < 2$, then $a < \sqrt{2a} < 2$.
(b) Prove that the sequence

$$\sqrt{2}, \quad \sqrt{2\sqrt{2}}, \quad \sqrt{2\sqrt{2\sqrt{2}}}, \quad \ldots$$

converges.
(c) Find the limit. Hint: Notice that if $\lim_{n \to \infty} a_n = l$, then $\lim_{n \to \infty} \sqrt{2a_n} =
\sqrt{2l}$, by Theorem 1.

5. Identify the function $f(x) = \lim_{n \to \infty} \left( \lim_{k \to \infty} (\cos n!\pi x)^{2k} \right)$. (It has been
mentioned many times in this book.)

6. Many impressive looking limits can be evaluated easily (especially by the
person who makes them up), because they are really lower or upper
sums in disguise. With this remark as hint, evaluate each of the following.
(Warning: the list contains one red herring which can be evaluated by
elementary considerations.)


a Met Vetere + Ver 
ΕΝ δ ee ee, 


4) ili 
N— 00 nN 

aie Vet Vertes + Ve" 
n-> © n 

i) tim Gt +55) 


1 1 1 
(iv) lim a Gai ) 


( 
(v) lim (—*— + Ap Ὁ ) 


SG Gane, Ὅτ 
: i n 7 7 
πῶ δ cy haat Ἐπ 4) 


7. Although limits like $\lim_{n \to \infty} \sqrt[n]{n}$ and $\lim_{n \to \infty} a^n$ can be evaluated using facts
about the behavior of the logarithm and exponential functions, this
approach is vaguely dissatisfying, because integral roots and powers can
be defined without using the exponential function. Some of the standard
"elementary" arguments for such limits are outlined here; the basic
tools are inequalities derived from the binomial theorem, notably

$$(1 + h)^n \ge 1 + nh, \qquad \text{for } h > 0;$$

and, for part (e),

$$(1 + h)^n \ge 1 + nh + \frac{n(n-1)}{2} h^2 \ge \frac{n(n-1)}{2} h^2, \qquad \text{for } h > 0.$$

(a) Prove that $\lim_{n \to \infty} a^n = \infty$ if $a > 1$, by setting $a = 1 + h$, where
$h > 0$.
(b) Prove that $\lim_{n \to \infty} a^n = 0$ if $0 < a < 1$.
(c) Prove that $\lim_{n \to \infty} \sqrt[n]{a} = 1$ if $a > 1$, by setting $\sqrt[n]{a} = 1 + h$ and
estimating $h$.
(d) Prove that $\lim_{n \to \infty} \sqrt[n]{a} = 1$ if $0 < a < 1$.
(e) Prove that $\lim_{n \to \infty} \sqrt[n]{n} = 1$.

8. (a) Prove that a convergent sequence is always bounded.
(b) Suppose that $\lim_{n \to \infty} a_n = 0$, and that each $a_n > 0$. Prove that the set
of all numbers $a_n$ actually has a maximum member.

9. (a) Prove that

$$\frac{1}{n+1} < \log(n+1) - \log n < \frac{1}{n}.$$

(b) If

$$a_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n,$$

show that the sequence $\{a_n\}$ is decreasing, and that each $a_n \ge 0$.
It follows that there is a number

$$\gamma = \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n \right).$$

This number, known as Euler's number, has proved to be quite
refractory; it is not even known whether $\gamma$ is rational.

10. (a) Suppose that $f$ is increasing on $[1, \infty)$. Show that

$$f(1) + \cdots + f(n-1) < \int_1^n f(x)\,dx < f(2) + \cdots + f(n).$$

(b) Now choose $f = \log$ and show that

$$\frac{n^n}{e^{n-1}} < n! < \frac{(n+1)^{n+1}}{e^n};$$

it follows that

$$\lim_{n \to \infty} \frac{\sqrt[n]{n!}}{n} = \frac{1}{e}.$$

This result shows that $\sqrt[n]{n!}$ is approximately $n/e$, in the sense that
the ratio of these two quantities is close to 1 for large $n$. But we cannot
conclude that $n!$ is close to $(n/e)^n$ in this sense; in fact, this is false.
An estimate for $n!$ is very desirable, even for concrete computations,
because $n!$ cannot be calculated easily even with logarithm
tables. The standard (and difficult) theorem which provides the
right information will be found in Problem 26-18.

*11. This problem investigates for which $x > 0$ the symbol

$$x^{x^{x^{\cdot^{\cdot^{\cdot}}}}}$$

makes sense. In other words, if we define $a_1(x) = x$, $a_{n+1}(x) = x^{a_n(x)}$,
when does $b(x) = \lim_{n \to \infty} a_n(x)$ exist?

(a) Prove that if $b(x)$ exists, then $x^{b(x)} = b(x)$.
(The situation is similar to that in Problem 4.)

(b) According to part (a), if $b(x)$ exists, then $x$ can be written in the form
$y^{1/y}$ for some $y$. Conclude that $0 < x \le e^{1/e}$. Hint: Consider the
graph of $f(y) = (\log y)/y$, as in Problem 17-24.

(c) Suppose, conversely, that $0 < x \le e^{1/e}$. Show that each $a_n(x) \le e$;
since $\{a_n(x)\}$ is clearly increasing, this proves that $b(x)$ exists (and
also that $b(x) \le e$).

(d) Find $b(\sqrt{2})$ and $b(e^{1/e})$.

(e) Prove that $b$ is differentiable on $(0, e^{1/e})$, and find a formula for
$b'(x)$ in terms of $b(x)$. Hint: This part is similar to Problem 17-24(/).

(f) Derive a formula for $b'(x)$ by differentiating the equation $x^{b(x)} =
b(x)$. (Of course, this method does not prove that $b$ is differentiable.)

*12. Prove that if $\lim_{n \to \infty} a_n = l$, then

$$\lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n} = l.$$

Hint: This problem is very similar to (in fact it is a special case of)
Problem 13-34.

*13. Suppose that $a_n > 0$ for each $n$ and that $\lim_{n \to \infty} a_{n+1}/a_n = l$. Prove that
$\lim_{n \to \infty} \sqrt[n]{a_n} = l$. Hint: This requires the same sort of argument that works
in Problem 12, together with the fact that $\lim_{n \to \infty} \sqrt[n]{a} = 1$, for $a > 0$.

14. (a) Suppose that $\{a_n\}$ is a convergent sequence of points all in [0, 1].
Prove that $\lim_{n \to \infty} a_n$ is also in [0, 1].

(b) Find a convergent sequence $\{a_n\}$ of points all in (0, 1) such that
$\lim_{n \to \infty} a_n$ is not in (0, 1).

15. Suppose that $f$ is continuous and that the sequence

$$x, \quad f(x), \quad f(f(x)), \quad f(f(f(x))), \quad \ldots$$

converges to $l$. Prove that $l$ is a "fixed point" for $f$, i.e., $f(l) = l$. Hint:
Two special cases have occurred already.

16. (a) Suppose that $f$ is continuous on [0, 1] and that $0 \le f(x) \le 1$ for all
$x$ in [0, 1]. Problem 7-11 shows that $f$ has a fixed point (in the
terminology of Problem 15). If $f$ is increasing, a much stronger statement
can be made: For any $x$ in [0, 1], the sequence

$$x, \quad f(x), \quad f(f(x)), \quad \ldots$$

has a limit (which is necessarily a fixed point, by Problem 15). Prove
this assertion, by examining the behavior of the sequence for $f(x) > x$
and $f(x) < x$, or by looking at Figure 7. A diagram of this sort is
used in Littlewood's Mathematician's Miscellany to preach the value
of drawing pictures: "For the professional the only proof needed is
[this Figure]."

FIGURE 7

*(b) Suppose that $f$ and $g$ are two continuous functions on [0, 1], with
$0 \le f(x) \le 1$ and $0 \le g(x) \le 1$ for all $x$ in [0, 1], which satisfy
$f \circ g = g \circ f$. Suppose, moreover, that $f$ is increasing. Show that
$f$ and $g$ have a common fixed point; in other words, there is a number
$l$ such that $f(l) = l = g(l)$. Hint: Begin by choosing a fixed
point for $g$.

***(c) Does the conclusion of part (b) hold without the assumption that
$f$ is increasing?

The trick in Problem 15 is really much more valuable than Problem 15
might suggest, and some of the most important "fixed point theorems" depend
upon looking at sequences of the form $x, f(x), f(f(x)), \ldots$. A special,
but representative, case of one such theorem is treated in Problem 18 (for
which the next problem is preparation).
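Sequences of the form $x, f(x), f(f(x)), \ldots$ are easy to compute. The sketch below is an illustration under stated assumptions: the helper name `iterate` and the choice $f = \cos$ on [0, 1] are hypothetical, not taken from the text. The cosine function maps [0, 1] into itself and, by the Mean Value Theorem, satisfies $|\cos x - \cos y| \le (\sin 1)\,|x - y|$ there, so it is a contraction in the sense of Problem 18.

```python
import math


def iterate(f, x, steps):
    """Return the result of applying f to x repeatedly, `steps` times."""
    for _ in range(steps):
        x = f(x)
    return x


# Assumed example: f = cos, starting from x = 0.5.
limit = iterate(math.cos, 0.5, 200)

print(limit)
print(abs(math.cos(limit) - limit))  # how close the computed value is to a fixed point
```

Starting from a different $x$ in [0, 1] leads to the same value, in keeping with the claim of Problem 18(b) that a contraction has at most one fixed point.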


17. (a) Use Problem 2-4 to show that if $c \ne 1$, then

$$c^m + \cdots + c^n = \frac{c^m - c^{n+1}}{1 - c}.$$

(b) Suppose that $|c| < 1$. Prove that

$$\lim_{m,n \to \infty} c^m + \cdots + c^n = 0.$$

(c) Suppose that $\{x_n\}$ is a sequence with $|x_n - x_{n+1}| \le c^n$, where
$c < 1$. Prove that $\{x_n\}$ is a Cauchy sequence.

*18. Suppose that $f$ is a function on $[a, b]$ such that

$$(*) \qquad |f(x) - f(y)| \le c|x - y|, \quad \text{for all } x \text{ and } y \text{ in } [a, b],$$

where $c < 1$. (Such a function is called a contraction.)

(a) Prove that $f$ is continuous.
(b) Prove that $f$ has at most one fixed point.
(c) By considering the sequence

$$x, \quad f(x), \quad f(f(x)), \quad \ldots$$

for any $x$ in $[a, b]$, prove that $f$ does have a fixed point. (This result,
in a more general setting, is known as the "contraction lemma.")

19. Let $\{x_n\}$ be a sequence which is bounded, and let

$$y_n = \sup \{x_n, x_{n+1}, x_{n+2}, \ldots\}.$$

(a) Prove that the sequence $\{y_n\}$ converges. The limit $\lim_{n \to \infty} y_n$ is denoted
by $\overline{\lim}_{n \to \infty}\, x_n$ or $\limsup_{n \to \infty} x_n$, and called the limit superior, or upper
limit, of the sequence $\{x_n\}$.

(b) Find $\overline{\lim}_{n \to \infty}\, x_n$ for each of the following:
(i) Xn = 


Cte ed 

nN 
Εἰ)» ἐν πὴ [ Ἕ 1 
(iv) x, = Vn. 


(c) Define $\underline{\lim}_{n \to \infty}\, x_n$ (or $\liminf_{n \to \infty} x_n$) and prove that

$$\underline{\lim}_{n \to \infty}\, x_n \le \overline{\lim}_{n \to \infty}\, x_n.$$

(d) Prove that $\lim_{n \to \infty} x_n$ exists if and only if $\overline{\lim}_{n \to \infty}\, x_n = \underline{\lim}_{n \to \infty}\, x_n$ and that in
this case $\lim_{n \to \infty} x_n = \overline{\lim}_{n \to \infty}\, x_n = \underline{\lim}_{n \to \infty}\, x_n$.

(e) Recall the definition, in Problem 8-18, of $\overline{\lim}\, A$ for a bounded set
$A$. Prove that if the numbers $x_n$ are distinct, then $\overline{\lim}_{n \to \infty}\, x_n = \overline{\lim}\, A$,
where $A = \{x_n : n \text{ in } \mathbf{N}\}$.

*20. (a) Suppose that $\{b_n\}$ is nonincreasing, with $b_n \ge 0$ for each $n$, and that

$$m \le a_1 + \cdots + a_n \le M$$

for all $n$. Prove that

$$b_1 m \le a_1 b_1 + \cdots + a_n b_n \le b_1 M.$$

This result is known as Abel's Lemma. Although it may not look
very interesting, it is the basis for several very interesting results
which will appear in later problem sets; in addition, the proof
depends on a rather tricky way of writing $a_1 b_1 + \cdots + a_n b_n$.
(b) Deduce that

$$b_k m \le a_k b_k + \cdots + a_n b_n \le b_k M$$

(a trivial consequence of part (a), inserted only because it is needed
later).


21. The Bolzano-Weierstrass Theorem is usually stated, and also proved,
quite differently than in the text; the classical statement uses the notion
of limit points. A point $x$ is a limit point of the set $A$ if for every $\varepsilon > 0$
there is a point $a$ in $A$ with $|x - a| < \varepsilon$ but $x \ne a$.

(a) Find all limit points of the following sets.
(i) $\left\{ \dfrac{1}{n} : n \text{ in } \mathbf{N} \right\}$.
(ii) $\left\{ \dfrac{1}{n} + \dfrac{1}{m} : n \text{ and } m \text{ in } \mathbf{N} \right\}$.
(iii) $\left\{ (-1)^n \left[ 1 + \dfrac{1}{n} \right] : n \text{ in } \mathbf{N} \right\}$.
(iv) $\mathbf{Z}$.
(v) $\mathbf{Q}$.

(b) Prove that $x$ is a limit point of $A$ if and only if for every $\varepsilon > 0$ there
are infinitely many points $a$ of $A$ satisfying $|x - a| < \varepsilon$.

(c) Prove that $\overline{\lim}\, A$ is the largest limit point of $A$, and $\underline{\lim}\, A$ the
smallest.

The usual form of the Bolzano-Weierstrass Theorem states that if $A$
is an infinite set of numbers contained in a closed interval $[a, b]$, then
some point of $[a, b]$ is a limit point of $A$. Prove this in two ways:

(d) Using the form already proved in the text. Hint: Since $A$ is infinite,
there are distinct numbers $x_1, x_2, x_3, \ldots$ in $A$.

(e) Using the Nested Intervals Theorem. Hint: If $[a, b]$ is divided into
two intervals, at least one must contain infinitely many points of $A$.

22. Use the Bolzano-Weierstrass Theorem to prove that if $f$ is continuous
on $[a, b]$, then $f$ is bounded above on $[a, b]$. Hint: If $f$ is not bounded
above, then there are points $x_n$ in $[a, b]$ with $f(x_n) > n$.

23. (a) Let $\{a_n\}$ be the sequence

$$\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, \tfrac{1}{5}, \tfrac{2}{5}, \ldots.$$

Suppose that $0 \le a < b \le 1$. Let $N(n; a, b)$ be the number of integers
$j \le n$ such that $a_j$ is in $[a, b]$. (Thus $N(2; \tfrac{1}{3}, \tfrac{2}{3}) = 2$, and
$N(4; \tfrac{1}{3}, \tfrac{2}{3}) = 3$.) Prove that

$$\lim_{n \to \infty} \frac{N(n; a, b)}{n} = b - a.$$

(b) A sequence $\{a_n\}$ of numbers in [0, 1] is called uniformly distributed
in [0, 1] if

$$\lim_{n \to \infty} \frac{N(n; a, b)}{n} = b - a$$

for all $a$ and $b$ with $0 \le a < b \le 1$. Prove that if $s$ is a step function
defined on [0, 1], and $\{a_n\}$ is uniformly distributed in [0, 1], then

$$\int_0^1 s = \lim_{n \to \infty} \frac{s(a_1) + \cdots + s(a_n)}{n}.$$

(c) Prove that if $\{a_n\}$ is uniformly distributed in [0, 1] and $f$ is integrable
on [0, 1], then

$$\int_0^1 f = \lim_{n \to \infty} \frac{f(a_1) + \cdots + f(a_n)}{n}.$$

**24. (a) Let $f$ be a function defined on [0, 1] such that $\lim_{y \to a} f(y)$ exists for all
$a$ in [0, 1]. For any $\varepsilon > 0$ prove that there are only finitely many
points $a$ in [0, 1] with $|\lim_{y \to a} f(y) - f(a)| > \varepsilon$. Hint: Show that the
set of such points cannot have a limit point $x$, by showing that
$\lim_{y \to x} f(y)$ could not exist.

(b) Prove that, in the terminology of Problem 20-5, the set of points
where $f$ is discontinuous is countable. This finally answers the question
of Problem 6-16: If $f$ has only removable discontinuities, then
$f$ is continuous except at a countable set of points, and in particular,
$f$ cannot be discontinuous everywhere.


CHAPTER 22

INFINITE SERIES

Infinite sequences were introduced in the previous chapter with the specific
intention of considering their "sums"

$$a_1 + a_2 + a_3 + \cdots$$

in this chapter. This is not an entirely straightforward matter, for the sum
of infinitely many numbers is as yet completely undefined. What can be
defined are the "partial sums"

$$s_n = a_1 + \cdots + a_n,$$

and the infinite sum must presumably be defined in terms of these partial
sums. Fortunately, the mechanism for formulating this definition has already
been developed in the previous chapter. If there is to be any hope of computing
the infinite sum $a_1 + a_2 + a_3 + \cdots$, the partial sums $s_n$ should
represent closer and closer approximations as $n$ is chosen larger and larger.
This last assertion amounts to little more than a sloppy definition of limits:
the "infinite sum" $a_1 + a_2 + a_3 + \cdots$ ought to be $\lim_{n \to \infty} s_n$. This approach
will necessarily leave the "sum" of many sequences undefined, since the
sequence $\{s_n\}$ may easily fail to have a limit. For example, the sequence

$$1, -1, 1, -1, \ldots$$

with $a_n = (-1)^{n+1}$ yields the new sequence

$$s_1 = a_1 = 1,$$
$$s_2 = a_1 + a_2 = 0,$$
$$s_3 = a_1 + a_2 + a_3 = 1,$$
$$s_4 = a_1 + a_2 + a_3 + a_4 = 0,$$
$$\cdots$$

for which $\lim_{n \to \infty} s_n$ does not exist. Although there happen to be some clever
extensions of the definition suggested here (see Problems 8 and 23-11) it
seems unavoidable that some sequences will have no sum. For this reason, an
acceptable definition of the sum of a sequence should contain, as an essential
component, terminology which distinguishes sequences for which sums can be
defined from less fortunate sequences.
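The partial sums of the example above are easily tabulated. This is a minimal sketch; the function name `partial_sums` is hypothetical, and it relies only on `itertools.accumulate` from the Python standard library.

```python
from itertools import accumulate


def partial_sums(term, how_many):
    """Return [s_1, ..., s_how_many], where s_n = term(1) + ... + term(n)."""
    return list(accumulate(term(n) for n in range(1, how_many + 1)))


# a_n = (-1)^(n+1): the terms 1, -1, 1, -1, ...
alternating = partial_sums(lambda n: (-1) ** (n + 1), 8)
print(alternating)
```

The partial sums alternate between 1 and 0 forever, so they cannot approach any single number; this is the sense in which the sequence has no sum.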


388 


DEFINITION 


Infinite Series 389 


The sequence {a,} is summable if the sequence {s5,} converges, where 
Sn = ait τ ἀρ. 


In this case, lim s, is denoted by 


n— © 
3 


dn (or, less formally, a: + a2 +a3+ °° *) 


n=l 


and is called the sum of the sequence {an}. 


The terminology introduced in this definition is usually replaced by less
precise expressions; indeed the title of this chapter is derived from such
everyday language. An infinite sum $\sum_{n=1}^{\infty} a_n$ is usually called an
infinite series, the word “series” emphasizing the connection with the infinite
sequence $\{a_n\}$. The statement that $\{a_n\}$ is, or is not, summable is
conventionally replaced by the statement that the series $\sum_{n=1}^{\infty} a_n$
does, or does not, converge. This terminology is somewhat peculiar, because at
best the symbol $\sum_{n=1}^{\infty} a_n$ denotes a number (so it can’t “converge”),
and it doesn’t denote anything at all unless $\{a_n\}$ is summable. Nevertheless,
this informal language is convenient, standard, and unlikely to yield to attacks
on logical grounds.
Certain elementary arithmetical operations on infinite series are direct
consequences of the definition. It is a simple exercise to show that if
$\{a_n\}$ and $\{b_n\}$ are summable, then

\[ \sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n, \]

\[ \sum_{n=1}^{\infty} c \cdot a_n = c \cdot \sum_{n=1}^{\infty} a_n. \]

As yet these equations are not very interesting, since we have no examples of
summable sequences (except for the trivial examples in which the terms are
eventually all 0). Before we actually exhibit a summable sequence, some
general conditions for summability will be recorded.

There is one necessary and sufficient condition for summability which can be
stated immediately. The sequence $\{a_n\}$ is summable if and only if the sequence
$\{s_n\}$ converges, which happens, according to Theorem 21-3, if and only if
$\lim_{m,n \to \infty} s_m - s_n = 0$; this condition can be rephrased in terms of
the original sequence as follows.

THE CAUCHY CRITERION

The sequence $\{a_n\}$ is summable if and only if

\[ \lim_{m,n \to \infty} a_{n+1} + \cdots + a_m = 0. \]

Although the Cauchy criterion is of theoretical importance, it is not very
useful for deciding the summability of any particular sequence. However, one
simple consequence of the Cauchy criterion provides a necessary condition for
summability which is too important not to be mentioned explicitly.

THE VANISHING CONDITION

If $\{a_n\}$ is summable, then

\[ \lim_{n \to \infty} a_n = 0. \]

This condition follows from the Cauchy criterion by taking $m = n + 1$; it can
also be proved directly as follows. If $\lim_{n \to \infty} s_n = l$, then

\[
\lim_{n \to \infty} a_n = \lim_{n \to \infty} (s_n - s_{n-1})
  = \lim_{n \to \infty} s_n - \lim_{n \to \infty} s_{n-1} = l - l = 0.
\]


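Unfortunately, this condition is far from sufficient. For example,
$\lim_{n \to \infty} 1/n = 0$, but the sequence $\{1/n\}$ is not summable; in fact the
following grouping of the numbers $1/n$ shows that the sequence $\{s_n\}$ is not
bounded:

\[
1 + \tfrac12
  + \underbrace{\tfrac13 + \tfrac14}_{\ge \frac12}
  + \underbrace{\tfrac15 + \tfrac16 + \tfrac17 + \tfrac18}_{\ge \frac12}
  + \underbrace{\tfrac19 + \cdots + \tfrac1{16}}_{\ge \frac12}
  + \cdots
\]

(the three groups consist of 2 terms, each $\ge \tfrac14$; 4 terms, each
$\ge \tfrac18$; and 8 terms, each $\ge \tfrac1{16}$).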
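The grouping argument says that each block of terms from $1/(2^{k-1}+1)$ to $1/2^k$ contributes at least $\tfrac12$, so that $s_{2^k} \ge 1 + k/2$. The short sketch below checks this bound with exact fractions; the function name `harmonic_partial_sum` is just a label chosen here.

```python
from fractions import Fraction

def harmonic_partial_sum(n):
    """Return s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Compare s_(2^k) with the lower bound 1 + k/2 obtained from the grouping.
for k in range(8):
    s = harmonic_partial_sum(2 ** k)
    bound = 1 + Fraction(k, 2)
    print(k, float(s), float(bound), s >= bound)
```

Since the bound $1 + k/2$ grows without limit, no number can bound all the partial sums, which is what the grouping was meant to show.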


The method of proof used in this example, a clever trick which one might
never see, reveals the need for some more standard methods for attacking these
problems. These methods shall be developed soon (one of them will give an
alternate proof that $\sum_{n=1}^{\infty} 1/n$ does not converge) but it will be
necessary to first procure a few examples of convergent series.

The most important of all infinite series are the “geometric series”

\[ \sum_{n=0}^{\infty} r^n = 1 + r + r^2 + r^3 + \cdots. \]


Only the cases $|r| < 1$ are interesting, since the individual terms do not
approach 0 if $|r| \ge 1$. These series can be managed because the partial sums

\[ s_n = 1 + r + \cdots + r^n \]

can be evaluated in simple terms. The two equations

\[
\begin{aligned}
s_n &= 1 + r + r^2 + \cdots + r^n \\
r s_n &= \phantom{1 + {}} r + r^2 + \cdots + r^n + r^{n+1}
\end{aligned}
\]

lead to

\[ s_n (1 - r) = 1 - r^{n+1} \]

or

\[ s_n = \frac{1 - r^{n+1}}{1 - r} \]

(division by $1 - r$ is valid since we are not considering the case $r = 1$). Now
$\lim_{n \to \infty} r^n = 0$, since $|r| < 1$. It follows that

\[ \sum_{n=0}^{\infty} r^n = \lim_{n \to \infty} \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r}, \qquad |r| < 1. \]

In particular,

\[ \sum_{n=1}^{\infty} \left(\tfrac12\right)^n = \sum_{n=0}^{\infty} \left(\tfrac12\right)^n - 1 = \frac{1}{1 - \tfrac12} - 1 = 1, \]

that is,

\[ \tfrac12 + \tfrac14 + \tfrac18 + \tfrac1{16} + \cdots = 1, \]

an infinite sum which can always be remembered from the picture in Figure 1.

[FIGURE 1: the interval from 0 to 1, marked at 1/2, 3/4, 7/8, and so cut into pieces of length 1/2, 1/4, 1/8, ...]

Special as they are, geometric series are standard examples from which
important tests for summability will be derived.

For a while we shall consider only sequences $\{a_n\}$ with each $a_n \ge 0$; such
sequences are called nonnegative. If $\{a_n\}$ is a nonnegative sequence, then
the sequence $\{s_n\}$ is clearly nondecreasing. This remark, combined with
Theorem 21-2, provides a simple-minded test for summability:

THE BOUNDEDNESS CRITERION

A nonnegative sequence $\{a_n\}$ is summable if and only if the set of partial
sums $s_n$ is bounded.

By itself, this criterion is not very helpful: deciding whether or not the set of
all $s_n$ is bounded is just what we are unable to do. On the other hand, if some
convergent series are already available for comparison, this criterion can be
used to obtain a result whose simplicity belies its importance (it is the basis for
almost all other tests).

THEOREM 1 (THE COMPARISON TEST)

Suppose that

\[ 0 \le a_n \le b_n \qquad \text{for all } n. \]


Then if $\sum_{n=1}^{\infty} b_n$ converges, so does $\sum_{n=1}^{\infty} a_n$.

PROOF

If

\[
\begin{aligned}
s_n &= a_1 + \cdots + a_n, \\
t_n &= b_1 + \cdots + b_n,
\end{aligned}
\]

then

\[ 0 \le s_n \le t_n \qquad \text{for all } n. \]

Now $\{t_n\}$ is bounded, since $\sum_{n=1}^{\infty} b_n$ converges. Therefore $\{s_n\}$
is bounded; consequently, by the boundedness criterion, $\sum_{n=1}^{\infty} a_n$
converges. ∎
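The mechanism of the proof can be watched numerically. In the sketch below the pair $a_n = 1/(2^n + n)$, $b_n = 1/2^n$ is an illustrative choice (it does not come from the text), and `partial_sums` is a helper defined only for this example. Each $s_n$ stays below the corresponding $t_n$, and $t_n$ stays below the geometric sum $1$.

```python
def partial_sums(term, n_max):
    """Return [s_1, ..., s_(n_max)] where s_n = term(1) + ... + term(n)."""
    sums, total = [], 0.0
    for n in range(1, n_max + 1):
        total += term(n)
        sums.append(total)
    return sums

def a(n):
    # Illustrative nonnegative sequence with a(n) <= b(n).
    return 1.0 / (2 ** n + n)

def b(n):
    # Geometric comparison sequence; b(1) + ... + b(n) = 1 - 1/2^n.
    return 1.0 / 2 ** n

s = partial_sums(a, 40)
t = partial_sums(b, 40)
print(s[-1], t[-1])
```

The nondecreasing sequence `s` is trapped below the bounded sequence `t`, which is exactly the situation the boundedness criterion requires.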


Quite frequently the comparison test can be used to analyze very complicated
looking series in which most of the complication is irrelevant. For example,

\[ \sum_{n=1}^{\infty} \frac{2 + \sin^3(n + 1)}{2^n + n^2} \]

converges because

\[ 0 \le \frac{2 + \sin^3(n + 1)}{2^n + n^2} < \frac{3}{2^n}, \]

and

\[ \sum_{n=1}^{\infty} \frac{3}{2^n} = 3 \sum_{n=1}^{\infty} \frac{1}{2^n} \]

is a convergent (geometric) series. Similarly, we would expect the series

\[ \sum_{n=1}^{\infty} \frac{1}{2^n - 1 + \sin^2 n^3} \]

to converge, since the $n$th term of the series is practically $1/2^n$. This last
assertion really means that

\[ \lim_{n \to \infty} \frac{1/(2^n - 1 + \sin^2 n^3)}{1/2^n} = 1, \]


and this formulation shows us how to verify our conjecture. If we choose any
number $c > 1$, then it is certainly true that

\[ 0 < \frac{1}{2^n - 1 + \sin^2 n^3} < c \cdot \frac{1}{2^n} \qquad \text{for large enough } n. \]

This shows that

\[ \sum_{n=N}^{\infty} \frac{1}{2^n - 1 + \sin^2 n^3} \]

converges for some $N$, and of course this implies convergence of the original
series.

It is also possible to use the comparison test backwards. Consider, for
example, the series

\[ \sum_{n=1}^{\infty} \frac{n + 1}{n^2 + 1}. \]

We would expect this series to diverge, since $(n + 1)/(n^2 + 1)$ is practically
$1/n$ for large $n$. To prove this, choose any number $c$ with $0 < c < 1$. Then

\[ 0 < c \cdot \frac{1}{n} < \frac{n + 1}{n^2 + 1} \qquad \text{for large enough } n. \]

This shows that the series $\sum_{n=1}^{\infty} (n + 1)/(n^2 + 1)$ diverges, because if
it converged, then by the comparison test $\sum_{n=1}^{\infty} 1/n$ would also
converge, and this we know is false.

The comparison test yields other important tests when we use previously
analyzed series as catalysts. Choosing the geometric series
$\sum_{n=0}^{\infty} r^n$, the convergent series par excellence, we obtain the most
important of all tests for summability.

THEOREM 2 (THE RATIO TEST)

Let $a_n > 0$ for all $n$, and suppose that

\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = r. \]

Then $\sum_{n=1}^{\infty} a_n$ converges if $r < 1$. On the other hand, if $r > 1$, then
the terms $a_n$ do not approach 0, so $\sum_{n=1}^{\infty} a_n$ diverges. (Notice that it
is therefore essential to compute $\lim_{n \to \infty} a_{n+1}/a_n$ and not
$\lim_{n \to \infty} a_n/a_{n+1}$!)

PROOF

Suppose first that $r < 1$. Choose any number $s$ with $r < s < 1$. The hypothesis

\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = r < 1 \]


implies that there is some $N$ such that

\[ \frac{a_{n+1}}{a_n} \le s \qquad \text{for } n \ge N. \]

This can be written

\[ a_{n+1} \le s a_n \qquad \text{for } n \ge N. \]

Thus

\[
\begin{aligned}
a_{N+1} &\le s a_N, \\
a_{N+2} &\le s a_{N+1} \le s^2 a_N, \\
&\;\;\vdots \\
a_{N+k} &\le s^k a_N.
\end{aligned}
\]

Since $\sum_{k=0}^{\infty} a_N s^k = a_N \sum_{k=0}^{\infty} s^k$ converges, the
comparison test shows that

\[ \sum_{n=N}^{\infty} a_n = \sum_{k=0}^{\infty} a_{N+k} \]

converges. This implies the convergence of the whole series
$\sum_{n=1}^{\infty} a_n$, which has only finitely many terms not included in
$\sum_{n=N}^{\infty} a_n$.

The case $r > 1$ is even easier. If $1 < s < r$, then there is a number $N$ such
that

\[ \frac{a_{n+1}}{a_n} \ge s \qquad \text{for } n \ge N, \]

which means that

\[ a_{N+k} \ge a_N s^k \ge a_N, \qquad k = 0, 1, \ldots. \]

This shows that the individual terms of $\{a_n\}$ do not approach 0, so $\{a_n\}$ is
not summable. ∎
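The first half of the proof can be followed step by step on a concrete sequence. The sketch below assumes the illustrative sequence $a_n = n^2/2^n$ (so $r = \tfrac12$) together with the choices $s = \tfrac34$ and $N = 5$; none of these values come from the text. It checks the inequalities $a_{n+1}/a_n \le s$ and $a_{N+k} \le s^k a_N$, and compares a long stretch of the tail with the geometric bound $a_N/(1 - s)$.

```python
from fractions import Fraction

def a(n):
    # Illustrative sequence: a_(n+1)/a_n = (n+1)^2 / (2 n^2) -> 1/2.
    return Fraction(n * n, 2 ** n)

s = Fraction(3, 4)   # any number strictly between r = 1/2 and 1 would do
N = 5                # from here on the ratios are at most s

# The two inequalities used in the proof, checked on a finite range only.
ratios_ok = all(a(n + 1) / a(n) <= s for n in range(N, N + 200))
terms_ok = all(a(N + k) <= s ** k * a(N) for k in range(60))

# Part of the tail a_N + a_(N+1) + ... against the geometric series a_N * sum s^k.
tail = sum(a(N + k) for k in range(60))
bound = a(N) / (1 - s)
print(ratios_ok, terms_ok, float(tail), float(bound))
```

A finite check of this kind proves nothing about the whole tail, of course; it merely shows the comparison with $a_N \sum s^k$ at work.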


As a simple application of the ratio test, consider the series
$\sum_{n=1}^{\infty} 1/n!$. Letting $a_n = 1/n!$ we obtain

\[ \frac{a_{n+1}}{a_n} = \frac{1/(n+1)!}{1/n!} = \frac{n!}{(n+1)!} = \frac{1}{n+1}. \]

Thus

\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = 0, \]

which shows that the series $\sum_{n=1}^{\infty} 1/n!$ converges. If we consider
instead the series $\sum_{n=1}^{\infty} r^n/n!$, where $r$ is some fixed positive
number, then

\[ \lim_{n \to \infty} \frac{r^{n+1}/(n+1)!}{r^n/n!} = \lim_{n \to \infty} \frac{r}{n+1} = 0, \]

so $\sum_{n=1}^{\infty} r^n/n!$ converges. It follows that

\[ \lim_{n \to \infty} \frac{r^n}{n!} = 0, \]

a result already proved in Chapter 16 (the proof given there was based on the
same ideas as those used in the ratio test). Finally, if we consider the series
$\sum_{n=1}^{\infty} n r^n$ we have

\[ \lim_{n \to \infty} \frac{(n+1) r^{n+1}}{n r^n} = \lim_{n \to \infty} r \cdot \frac{n+1}{n} = r, \]

since $\lim_{n \to \infty} (n + 1)/n = 1$. This proves that if $0 \le r < 1$, then
$\sum_{n=1}^{\infty} n r^n$ converges, and consequently

\[ \lim_{n \to \infty} n r^n = 0. \]

(This result clearly holds for $-1 < r \le 0$, also.) It is a useful exercise to
provide a direct proof of this limit, without using the ratio test as an
intermediary.
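The three ratios just computed can be tabulated directly. This is a rough floating-point sketch; the value $r = 0.9$ and the helper `ratio` are arbitrary illustrative choices.

```python
from math import factorial

r = 0.9  # illustrative value with 0 < r < 1

def ratio(a, n):
    """Return a(n+1)/a(n)."""
    return a(n + 1) / a(n)

examples = {
    "1/n!": lambda n: 1.0 / factorial(n),
    "r^n/n!": lambda n: r ** n / factorial(n),
    "n r^n": lambda n: n * r ** n,
}

# The first two ratios tend to 0, the third tends to r.
for name, a in examples.items():
    print(name, [round(ratio(a, n), 4) for n in (1, 10, 100)])

# The terms n r^n themselves shrink to 0, as the convergence of the series requires.
print([n * r ** n for n in (10, 100, 1000)])
```

For the third series the ratio at $n = 100$ is $0.9 \cdot 101/100$, already close to $r$.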
Although the ratio test will be of the utmost theoretical importance, as a
practical tool it will frequently be found disappointing. One drawback of the
ratio test is the fact that $\lim_{n \to \infty} a_{n+1}/a_n$ may be quite difficult to
determine, and may not even exist. A more serious deficiency, which appears
with maddening regularity, is the fact that the limit might equal 1. The case
$\lim_{n \to \infty} a_{n+1}/a_n = 1$ is precisely the one which is inconclusive:
$\{a_n\}$ might not be summable (for example, if $a_n = 1/n$), but then again it
might be. In fact, our very next test will show that
$\sum_{n=1}^{\infty} (1/n)^2$ converges, even though

\[ \lim_{n \to \infty} \frac{\left(\frac{1}{n+1}\right)^2}{\left(\frac{1}{n}\right)^2} = 1. \]


This test provides a quite different method for determining convergence or
divergence of infinite series. Like the ratio test, it is an immediate consequence
of the comparison test, but the series chosen for comparison is quite novel.

THEOREM 3 (THE INTEGRAL TEST)

Suppose that $f$ is positive and decreasing on $[1, \infty)$, and that $f(n) = a_n$
for all $n$. Then $\sum_{n=1}^{\infty} a_n$ converges if and only if the limit

\[ \int_1^{\infty} f = \lim_{A \to \infty} \int_1^A f \]

exists.

PROOF

The existence of $\lim_{A \to \infty} \int_1^A f$ is equivalent to convergence of the
series

\[ \int_1^2 f + \int_2^3 f + \int_3^4 f + \cdots. \]

Now, since $f$ is decreasing we have (Figure 2)

\[ f(n + 1) < \int_n^{n+1} f < f(n). \]

[FIGURE 2]

The first half of this double inequality shows that the series
$\sum_{n=1}^{\infty} a_{n+1}$ may be compared to the series
$\sum_{n=1}^{\infty} \int_n^{n+1} f$, proving that $\sum_{n=1}^{\infty} a_{n+1}$ (and hence
$\sum_{n=1}^{\infty} a_n$) converges if $\lim_{A \to \infty} \int_1^A f$ exists.

The second half of the inequality shows that the series
$\sum_{n=1}^{\infty} \int_n^{n+1} f$ may be compared to the series
$\sum_{n=1}^{\infty} a_n$, proving that $\lim_{A \to \infty} \int_1^A f$ must exist if
$\sum_{n=1}^{\infty} a_n$ converges. ∎
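The double inequality in the proof can be checked exactly for a particular function. The sketch below assumes $f(x) = 1/x^2$, chosen only because $\int_a^b 1/x^2\,dx = 1/a - 1/b$ is available in closed form; the names `f` and `integral` are local to this example.

```python
from fractions import Fraction

def f(n):
    """f(x) = 1/x^2 evaluated at a positive integer n."""
    return Fraction(1, n * n)

def integral(a, b):
    """Exact integral of 1/x^2 from a to b (an antiderivative is -1/x)."""
    return Fraction(1, a) - Fraction(1, b)

N = 200

# f(n+1) < integral from n to n+1 < f(n), for each n checked.
step_ok = all(f(n + 1) < integral(n, n + 1) < f(n) for n in range(1, N))

# Adding the inequalities for n = 1, ..., N-1 traps the integral from 1 to N.
lower = sum(f(n + 1) for n in range(1, N))   # a_2 + ... + a_N
upper = sum(f(n) for n in range(1, N))       # a_1 + ... + a_(N-1)
area = integral(1, N)
print(step_ok, float(lower), float(area), float(upper))
```

Because `area` never exceeds $1$, the sums $a_2 + \cdots + a_N$ stay bounded, which is how the first half of the inequality yields convergence.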


Only one example using the integral test will be given here, but it settles the
question of convergence for infinitely many series at once. If $p > 0$, the
convergence of $\sum_{n=1}^{\infty} 1/n^p$ is equivalent, by the integral test, to the
existence of

\[ \int_1^{\infty} \frac{1}{x^p}\,dx. \]

Now

\[
\int_1^A \frac{1}{x^p}\,dx =
\begin{cases}
-\dfrac{1}{p - 1} \cdot \dfrac{1}{A^{p-1}} + \dfrac{1}{p - 1}, & p \ne 1 \\[2ex]
\log A, & p = 1.
\end{cases}
\]

This shows that $\lim_{A \to \infty} \int_1^A 1/x^p\,dx$ exists if $p > 1$, but not if
$p \le 1$. Thus $\sum_{n=1}^{\infty} 1/n^p$ converges precisely for $p > 1$. In
particular, $\sum_{n=1}^{\infty} 1/n$ diverges.

The tests considered so far apply only to nonnegative sequences, but
nonpositive sequences may be handled in precisely the same way. In fact, since

\[ \sum_{n=1}^{\infty} a_n = -\left( \sum_{n=1}^{\infty} -a_n \right), \]

all considerations about nonpositive sequences can be reduced to questions
involving nonnegative sequences. Sequences which contain both positive and
negative terms are quite another story.

If $\sum_{n=1}^{\infty} a_n$ is a sequence with both positive and negative terms, one
can consider instead the sequence $\sum_{n=1}^{\infty} |a_n|$, all of whose terms are
nonnegative. Cheerfully ignoring the possibility that we may have thrown away
all the interesting information about the original sequence, we proceed to
eulogize those sequences which are converted by this procedure into convergent
sequences.

DEFINITION

The series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent if the series
$\sum_{n=1}^{\infty} |a_n|$ is convergent. (In more formal language, the sequence
$\{a_n\}$ is absolutely summable if the sequence $\{|a_n|\}$ is summable.)

Although we have no right to expect this definition to be of any interest, it
turns out to be exceedingly important. The following theorem shows that the
definition is at least not entirely useless.

THEOREM 4

Every absolutely convergent series is convergent. Moreover, a series is
absolutely convergent if and only if the series formed from its positive terms
and the series formed from its negative terms both converge.

PROOF

If $\sum_{n=1}^{\infty} |a_n|$ converges, then, by the Cauchy criterion,

\[ \lim_{m,n \to \infty} |a_{n+1}| + \cdots + |a_m| = 0. \]

Since

\[ |a_{n+1} + \cdots + a_m| \le |a_{n+1}| + \cdots + |a_m|, \]

it follows that

\[ \lim_{m,n \to \infty} a_{n+1} + \cdots + a_m = 0, \]

which shows that $\sum_{n=1}^{\infty} a_n$ converges.


To prove the second part of the theorem, let

\[
a_n^+ = \begin{cases} a_n, & \text{if } a_n \ge 0 \\ 0, & \text{if } a_n \le 0, \end{cases}
\qquad
a_n^- = \begin{cases} a_n, & \text{if } a_n \le 0 \\ 0, & \text{if } a_n \ge 0, \end{cases}
\]

so that $\sum_{n=1}^{\infty} a_n^+$ is the series formed from the positive terms of
$\sum_{n=1}^{\infty} a_n$, and $\sum_{n=1}^{\infty} a_n^-$ is the series formed from the
negative terms.

If $\sum_{n=1}^{\infty} a_n^+$ and $\sum_{n=1}^{\infty} a_n^-$ both converge, then

\[ \sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} [a_n^+ - (a_n^-)] = \sum_{n=1}^{\infty} a_n^+ - \sum_{n=1}^{\infty} a_n^- \]

also converges, so $\sum_{n=1}^{\infty} a_n$ converges absolutely.

On the other hand, if $\sum_{n=1}^{\infty} |a_n|$ converges, then, as we have just
shown, $\sum_{n=1}^{\infty} a_n$ also converges. Therefore

\[ \sum_{n=1}^{\infty} a_n^+ = \frac12 \left( \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} |a_n| \right) \]

and

\[ \sum_{n=1}^{\infty} a_n^- = \frac12 \left( \sum_{n=1}^{\infty} a_n - \sum_{n=1}^{\infty} |a_n| \right) \]

both converge. ∎

It follows from Theorem 4 that every convergent series with positive terms
can be used to obtain infinitely many other convergent series, simply by
putting in minus signs at random. Not every convergent series can be obtained in
this way, however: there are series which are convergent but not absolutely
convergent (such series are called conditionally convergent). In order to
prove this statement we need a test for convergence which applies
specifically to series with positive and negative terms.

THEOREM 5 (LEIBNIZ’S THEOREM)

Suppose that

\[ a_1 \ge a_2 \ge a_3 \ge \cdots \ge 0, \]

and that

\[ \lim_{n \to \infty} a_n = 0. \]

Then the series

\[ \sum_{n=1}^{\infty} (-1)^{n+1} a_n = a_1 - a_2 + a_3 - a_4 + a_5 - \cdots \]

converges.


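PROOF

Figure 3 illustrates relationships between the partial sums which we will
establish:

(1) $s_2 \le s_4 \le s_6 \le \cdots$,
(2) $s_1 \ge s_3 \ge s_5 \ge \cdots$,
(3) $s_k \le s_l$ if $k$ is even and $l$ is odd.

[FIGURE 3: the partial sums on a line, from left to right $s_2, s_4, s_6, s_8, s_{10}, s_{12}, s_{11}, s_9, s_7, s_5, s_3, s_1$]

To prove the first two inequalities, observe that

\[
\begin{aligned}
(1)\quad s_{2n+2} &= s_{2n} + a_{2n+1} - a_{2n+2} \\
                  &\ge s_{2n}, \qquad \text{since } a_{2n+1} \ge a_{2n+2}, \\
(2)\quad s_{2n+3} &= s_{2n+1} - a_{2n+2} + a_{2n+3} \\
                  &\le s_{2n+1}, \qquad \text{since } a_{2n+2} \ge a_{2n+3}.
\end{aligned}
\]

To prove the third inequality, notice first that

\[ s_{2n} = s_{2n-1} - a_{2n} \le s_{2n-1}, \qquad \text{since } a_{2n} \ge 0. \]

This proves only a special case of (3), but in conjunction with (1) and (2) the
general case is easy: if $k$ is even and $l$ is odd, choose $n$ such that

\[ 2n \ge k \quad \text{and} \quad 2n - 1 \ge l; \]

then

\[ s_k \le s_{2n} \le s_{2n-1} \le s_l, \]

which proves (3).

Now, the sequence $\{s_{2n}\}$ converges, because it is nondecreasing and is
bounded above (by $s_l$ for any odd $l$). Let

\[ \alpha = \sup \{s_{2n}\} = \lim_{n \to \infty} s_{2n}. \]

Similarly, let

\[ \beta = \inf \{s_{2n+1}\} = \lim_{n \to \infty} s_{2n+1}. \]

It follows from (3) that $\alpha \le \beta$; since

\[ s_{2n+1} - s_{2n} = a_{2n+1} \quad \text{and} \quad \lim_{n \to \infty} a_n = 0 \]

it is actually the case that $\alpha = \beta$. This proves that
$\alpha = \beta = \lim_{n \to \infty} s_n$. ∎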
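The three relationships established in the proof are easy to observe on the series with $a_n = 1/n$. In the sketch below, `leibniz_partial_sums` is a name chosen here, and the comparison with $\log 2$ relies on the value of this sum quoted later in the chapter.

```python
from fractions import Fraction
from math import log

def leibniz_partial_sums(a, n_max):
    """Partial sums s_1, ..., s_(n_max) of a(1) - a(2) + a(3) - ..."""
    sums, total = [], Fraction(0)
    for n in range(1, n_max + 1):
        total += (-1) ** (n + 1) * a(n)
        sums.append(total)
    return sums

s = leibniz_partial_sums(lambda n: Fraction(1, n), 40)
even = s[1::2]   # s_2, s_4, s_6, ...  (nondecreasing)
odd = s[0::2]    # s_1, s_3, s_5, ...  (nonincreasing)

print(float(even[-1]), log(2), float(odd[-1]))
```

The even partial sums climb, the odd ones descend, and every even one lies below every odd one, so the two lists squeeze the sum between them.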


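The standard example derived from Theorem 5 is the series

\[ 1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \cdots, \]

which is convergent, but not absolutely convergent (since
$\sum_{n=1}^{\infty} 1/n$ does not converge). If the sum of this series is denoted by
$x$, the following manipulations lead to quite a paradoxical result:

\[
\begin{aligned}
x &= 1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \tfrac16 + \cdots \\
  &= 1 - \tfrac12 - \tfrac14 + \tfrac13 - \tfrac16 - \tfrac18 + \tfrac15 - \tfrac1{10} - \tfrac1{12} + \tfrac17 - \tfrac1{14} - \tfrac1{16} + \cdots
\end{aligned}
\]

(the pattern here is one positive term followed by two negative ones)

\[
\begin{aligned}
  &= (1 - \tfrac12) - \tfrac14 + (\tfrac13 - \tfrac16) - \tfrac18 + (\tfrac15 - \tfrac1{10}) - \tfrac1{12} + (\tfrac17 - \tfrac1{14}) - \tfrac1{16} + \cdots \\
  &= \tfrac12 - \tfrac14 + \tfrac16 - \tfrac18 + \tfrac1{10} - \tfrac1{12} + \tfrac1{14} - \tfrac1{16} + \cdots \\
  &= \tfrac12 (1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \tfrac16 + \tfrac17 - \tfrac18 + \cdots) \\
  &= \tfrac12 x,
\end{aligned}
\]

so $x = x/2$, implying that $x = 0$. On the other hand, it is easy to see that
$x \ne 0$: the partial sum $s_2$ equals $\tfrac12$, and the proof of Leibniz’s Theorem
shows that $x \ge s_2$.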
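A numerical experiment suggests where the manipulation goes astray. In this rough sketch (the function names are made up for the illustration, and the comparison with $\log 2$ uses the value of the sum quoted later in the chapter), the series summed in its original order and the series summed in the pattern "one positive term, two negative terms" appear to approach different numbers.

```python
from math import log

def original_sum(n_terms):
    """1 - 1/2 + 1/3 - ... taken in the original order."""
    return sum((-1) ** (n + 1) / n for n in range(1, n_terms + 1))

def rearranged_sum(n_blocks):
    """Blocks 1/(2m-1) - 1/(4m-2) - 1/(4m): one positive term, then two negative ones."""
    return sum(1 / (2 * m - 1) - 1 / (4 * m - 2) - 1 / (4 * m)
               for m in range(1, n_blocks + 1))

print(original_sum(20000), log(2))
print(rearranged_sum(20000), log(2) / 2)
```

Both computations use exactly the same numbers $\pm 1/n$, each one once; only the order in which they are added differs.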

This contradiction depends on a step which takes for granted that operations
valid for finite sums necessarily have analogues for infinite sums. It is
true that the sequence

\[ \{b_n\} = 1, -\tfrac12, -\tfrac14, \tfrac13, -\tfrac16, -\tfrac18, \tfrac15, -\tfrac1{10}, -\tfrac1{12}, \ldots \]

contains all the numbers in the sequence

\[ \{a_n\} = 1, -\tfrac12, \tfrac13, -\tfrac14, \tfrac15, -\tfrac16, \tfrac17, -\tfrac18, \tfrac19, -\tfrac1{10}, \tfrac1{11}, -\tfrac1{12}, \ldots. \]

In fact, $\{b_n\}$ is a rearrangement of $\{a_n\}$ in the following precise sense: each
$a_n = b_{f(n)}$ where $f$ is a certain function which “permutes” the natural
numbers, that is, every natural number $m$ is $f(n)$ for precisely one $n$. In our
example

$f(2m + 1) = 3m + 1$ (the terms $1, \tfrac13, \tfrac15, \ldots$ go into the 1st, 4th, 7th, ...
places),

$f(4m) = 3m$ (the terms $-\tfrac14, -\tfrac18, -\tfrac1{12}, \ldots$ go into the 3rd, 6th,
9th, ... places),

$f(4m + 2) = 3m + 2$ (the terms $-\tfrac12, -\tfrac16, -\tfrac1{10}, \ldots$ go into the 2nd,
5th, 8th, ... places).

Nevertheless, there is no reason to assume that $\sum_{n=1}^{\infty} b_n$ should
equal $\sum_{n=1}^{\infty} a_n$: these sums are, by definition,
$\lim_{n \to \infty} b_1 + \cdots + b_n$ and $\lim_{n \to \infty} a_1 + \cdots + a_n$,
so the particular order of the terms can quite conceivably matter. The series
$\sum_{n=1}^{\infty} (-1)^{n+1}/n$ is not special in this regard; indeed, its behavior
is typical of series which are not absolutely convergent. The following result
(really more of a grand counterexample than a theorem) shows how bad
conditionally convergent series are.

THEOREM 6

If $\sum_{n=1}^{\infty} a_n$ converges, but does not converge absolutely, then for any
number $\alpha$ there is a rearrangement $\{b_n\}$ of $\{a_n\}$ such that
$\sum_{n=1}^{\infty} b_n = \alpha$.

PROOF

Let $\sum_{n=1}^{\infty} p_n$ denote the series formed from the positive terms of
$\{a_n\}$ and let $\sum_{n=1}^{\infty} q_n$ denote the series of negative terms. It
follows from Theorem 4 that at least one of these series does not converge. As a
matter of fact, both must fail to converge, for if one had bounded partial sums,
and the other had unbounded partial sums, then the original series
$\sum_{n=1}^{\infty} a_n$ would also have unbounded partial sums, contradicting the
assumption that it converges.

Now let $\alpha$ be any number. Assume, for simplicity, that $\alpha > 0$ (the proof
for $\alpha \le 0$ will be a simple modification). Since the series
$\sum_{n=1}^{\infty} p_n$ is not convergent, there is a number $N$ such that

\[ \sum_{n=1}^{N} p_n > \alpha. \]

We will choose $N_1$ to be the smallest $N$ with this property. This means that

\[ (1) \quad \sum_{n=1}^{N_1 - 1} p_n \le \alpha, \]

\[ \text{but} \quad (2) \quad \sum_{n=1}^{N_1} p_n > \alpha. \]

Then if

\[ S_1 = \sum_{n=1}^{N_1} p_n, \]

we have

\[ S_1 - \alpha \le p_{N_1}. \]

This relation, which is clear from Figure 4, follows immediately from equation
(1):

\[ S_1 - \alpha \le S_1 - \sum_{n=1}^{N_1 - 1} p_n = p_{N_1}. \]


[FIGURE 4: a number line showing $0$, $p_1 + \cdots + p_{N_1 - 1}$, $\alpha$, and $S_1$; the last step, of length $p_{N_1}$, carries the sum past $\alpha$]

To the sum $S_1$ we now add on just enough negative terms to obtain a new sum
$T_1$ which is less than $\alpha$. In other words, we choose the smallest integer
$M_1$ for which

\[ T_1 = S_1 + \sum_{n=1}^{M_1} q_n < \alpha. \]

As before, we have

\[ \alpha - T_1 \le -q_{M_1}. \]

We now continue this procedure indefinitely, obtaining sums alternately
larger and smaller than $\alpha$, each time choosing the smallest $N_k$ or $M_k$
possible. The sequence

\[ p_1, \ldots, p_{N_1}, q_1, \ldots, q_{M_1}, p_{N_1 + 1}, \ldots, p_{N_2}, \ldots \]

is a rearrangement of $\{a_n\}$. The partial sums of this rearrangement increase
to $S_1$, then decrease to $T_1$, then increase to $S_2$, then decrease to $T_2$, etc. To
complete the proof we simply note that $|S_k - \alpha|$ and $|T_k - \alpha|$ are less than
or equal to $p_{N_k}$ or $-q_{M_k}$, respectively, and that these terms, being members
of the original sequence $\{a_n\}$, must decrease to 0, since
$\sum_{n=1}^{\infty} a_n$ converges. ∎

Together with Theorem 6, the next theorem establishes conclusively the
distinction between conditionally convergent and absolutely convergent
series.

THEOREM 7

If $\sum_{n=1}^{\infty} a_n$ converges absolutely, and $\{b_n\}$ is any rearrangement of
$\{a_n\}$, then $\sum_{n=1}^{\infty} b_n$ also converges (absolutely), and

\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} b_n. \]

PROOF

Let us denote the partial sums of $\{a_n\}$ by $s_n$, and the partial sums of $\{b_n\}$
by $t_n$.


Suppose that $\varepsilon > 0$. Since $\sum_{n=1}^{\infty} a_n$ converges, there is some
$N$ such that

\[ \left| \sum_{n=1}^{\infty} a_n - s_N \right| < \varepsilon. \]

Moreover, since $\sum_{n=1}^{\infty} |a_n|$ converges, we can also choose $N$ so that

\[ \sum_{n=1}^{\infty} |a_n| - (|a_1| + \cdots + |a_N|) < \varepsilon, \]

i.e., so that

\[ |a_{N+1}| + |a_{N+2}| + |a_{N+3}| + \cdots < \varepsilon. \]

Now choose $M$ so large that each of $a_1, \ldots, a_N$ appears among
$b_1, \ldots, b_M$. Then whenever $m > M$, the difference $t_m - s_N$ is the sum of
certain $a_i$, where $a_1, \ldots, a_N$ are definitely excluded. Consequently,

\[ |t_m - s_N| \le |a_{N+1}| + |a_{N+2}| + |a_{N+3}| + \cdots. \]

Thus, if $m > M$, then

\[
\begin{aligned}
\left| \sum_{n=1}^{\infty} a_n - t_m \right|
  &= \left| \sum_{n=1}^{\infty} a_n - s_N - (t_m - s_N) \right| \\
  &\le \left| \sum_{n=1}^{\infty} a_n - s_N \right| + |t_m - s_N| \\
  &< \varepsilon + \varepsilon.
\end{aligned}
\]

Since this is true for every $\varepsilon > 0$, the series $\sum_{n=1}^{\infty} b_n$
converges to $\sum_{n=1}^{\infty} a_n$. To prove that it converges absolutely, notice
that the above argument can be applied to the positive terms of
$\sum_{n=1}^{\infty} a_n$ and the negative terms separately; this shows that the series
formed from the positive and negative terms of $\sum_{n=1}^{\infty} b_n$ each
converge, so $\sum_{n=1}^{\infty} b_n$ converges absolutely. ∎
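The contrast between Theorems 6 and 7 can be sketched numerically. The code below makes several assumptions of its own: it applies the procedure from the proof of Theorem 6 to the particular series $1 - \tfrac12 + \tfrac13 - \cdots$ with the arbitrary target $\alpha = 1.5$, and it uses $\sum (-1)^{n+1}/n^2$ as a sample absolutely convergent series. All function names are hypothetical.

```python
def rearrange_to_target(alpha, n_terms):
    """Partial sums of a rearrangement of 1 - 1/2 + 1/3 - ... built as in the
    proof of Theorem 6 (alpha > 0): add unused positive terms 1, 1/3, 1/5, ...
    until the sum exceeds alpha, then unused negative terms -1/2, -1/4, ...
    until it drops below alpha, and repeat."""
    next_pos, next_neg = 1, 2      # denominators of the next unused terms
    total, sums = 0.0, []
    going_up = True
    while len(sums) < n_terms:
        if going_up:
            total += 1.0 / next_pos
            next_pos += 2
            if total > alpha:
                going_up = False
        else:
            total -= 1.0 / next_neg
            next_neg += 2
            if total < alpha:
                going_up = True
        sums.append(total)
    return sums

def pattern_sum(term, n_blocks):
    """Sum term(n) in the order 1, 2, 4, 3, 6, 8, 5, 10, 12, ..."""
    return sum(term(2 * m - 1) + term(4 * m - 2) + term(4 * m)
               for m in range(1, n_blocks + 1))

def abs_conv_term(n):
    # Sample absolutely convergent series: sum of (-1)^(n+1) / n^2.
    return (-1) ** (n + 1) / n ** 2

target = 1.5
sums = rearrange_to_target(target, 100000)
in_order = sum(abs_conv_term(n) for n in range(1, 300001))
rearranged = pattern_sum(abs_conv_term, 100000)
print(sums[-1], in_order, rearranged)
```

The conditionally convergent series can be steered to the chosen target, while the same kind of reshuffling leaves the sum of the absolutely convergent series essentially unchanged.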


n=1 


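The theorems of this chapter have been concerned with summability of
sequences, but not with the actual sums. Generally speaking, there is no reason
to presume that a given infinite sum can be “evaluated” in any simpler terms.
However, many simple expressions can be equated to infinite sums by using
Taylor’s Theorem. Chapter 19 provides many examples of functions for which

\[ f(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i + R_{n,a}(x), \]

where $\lim_{n \to \infty} R_{n,a}(x) = 0$. This is precisely equivalent to

\[ f(x) = \lim_{n \to \infty} \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i, \]

which means, in turn, that

\[ f(x) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!} (x - a)^i. \]

As particular examples we have

\[
\begin{aligned}
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots, \\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \\
e^x &= 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \\
\arctan x &= x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots, \qquad |x| \le 1, \\
\log(1 + x) &= x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \qquad -1 < x \le 1.
\end{aligned}
\]

(Notice that the series for $\arctan x$ and $\log(1 + x)$ do not even converge for
$|x| > 1$; in addition, when $x = -1$, the series for $\log(1 + x)$ becomes

\[ -1 - \tfrac12 - \tfrac13 - \tfrac14 - \cdots, \]

which does not converge.)

Some pretty impressive results are obtained with particular values of $x$:

\[
\begin{aligned}
0 &= \pi - \frac{\pi^3}{3!} + \frac{\pi^5}{5!} - \frac{\pi^7}{7!} + \cdots, \\
e &= 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots, \\
\frac{\pi}{4} &= 1 - \frac13 + \frac15 - \frac17 + \cdots, \\
\log 2 &= 1 - \frac12 + \frac13 - \frac14 + \cdots.
\end{aligned}
\]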
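These four identities can be sampled with a few lines of floating-point arithmetic. The sketch is only a sanity check; the helper name `partial_sum` and the numbers of terms are arbitrary choices.

```python
from math import e, pi, log, factorial

def partial_sum(term, n_terms):
    """Return term(0) + term(1) + ... + term(n_terms - 1)."""
    return sum(term(k) for k in range(n_terms))

# sin(pi) = 0 and e: the factorials make these converge very quickly.
sin_pi = partial_sum(lambda k: (-1) ** k * pi ** (2 * k + 1) / factorial(2 * k + 1), 20)
e_approx = partial_sum(lambda k: 1.0 / factorial(k), 20)

# pi/4 and log 2: the error only shrinks like 1/n.
pi_over_4 = partial_sum(lambda k: (-1) ** k / (2 * k + 1), 100000)
log_2 = partial_sum(lambda k: (-1) ** k / (k + 1), 100000)

print(sin_pi, e_approx - e, pi_over_4 - pi / 4, log_2 - log(2))
```

The contrast in speed is striking: twenty terms pin down $e$ to many decimal places, while a hundred thousand terms of the series for $\log 2$ give only about five.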


More significant developments may be anticipated if we compare the series
for $\sin x$ and $\cos x$ a little more carefully. The series for $\cos x$ is just the
one we would have obtained if we had enthusiastically differentiated both sides
of the equation

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \]

term-by-term, ignoring the fact that we have never proved anything about the
derivatives of infinite sums. Likewise, if we differentiate both sides of the
formula for $\cos x$ formally (i.e., without justification) we obtain the formula
$\cos'(x) = -\sin x$, and if we differentiate the formula for $e^x$ we obtain
$\exp'(x) = \exp(x)$. In the next chapter we shall see that such term-by-term
differentiation of infinite sums is indeed valid in certain important cases.


PROBLEMS

1. Decide whether each of the following infinite series is convergent or
divergent. The tools which you will need are Leibniz’s Theorem and
the comparison, ratio, and integral tests. A few examples have been
picked with malice aforethought; two series which look quite similar
may require different tests (and then again, they may not). The hint
below indicates which tests may be used.


Ἷ sin nO 
oe 
ὦ 1-}4+4-44 
Gi) 1-$48-444-44+8-44 
ivy δ᾽ (=1pn 282 
n=l fe 
(v) > Ἐν een (The summation begins with n = 2 simply to 
zs Vn? — 1 avoid the meaningless term obtained for n = 1.) 
a. 4 
(vi) 
- Vn? +1 
= 2 
(vii) 
Fem | n! 
(viii) oe 
n=] " 
(ix) 
log n 
1 
ἰὼ (log n)* 


. 4 

(xi) " dos 2)" ε ἢ 

ἄν ἐν 1 
ὌΧ - τὴ 
Gi a 7 


[-.] 
Ἢ ἜΝ 
(xiv) sin —- 
n 




Tee! 
Cw) Σ n log n 


n=2 


; 1 
evi) Σ aa 


n=2 
3 


is 1 
(xvii) Σ logs) 


n=2 


ce 3 = ἢ 


Hint: Use the comparison test for (i), (ii), (v), (vi), (ix), (x), (xi), (xiii),
(xiv), (xvii); the ratio test for (vii), (xviii), (xix), (xx); the integral test
for (viii), (xv), (xvi).


The next two problems examine, with hints, some infinite series that require 
more delicate analysis than those in Problem 1. 


*2. (a) If you have successfully solved examples (xix) and (xx) from
Problem 1, it should be clear that $\sum_{n=1}^{\infty} a^n n!/n^n$ converges for
$a < e$ and diverges for $a > e$. For $a = e$ the ratio test fails; show that
$\sum_{n=1}^{\infty} e^n n!/n^n$ actually diverges, by using Problem 21-10.

(b) Decide when $\sum_{n=1}^{\infty} n^n/(a^n n!)$ converges, again resorting to
Problem 21-10 when the ratio test fails.


*3. Problem 1 presented the two series $\sum_{n=2}^{\infty} (\log n)^{-k}$ and
$\sum_{n=2}^{\infty} (\log n)^{-n}$, of which the first diverges while the second
converges. The series

\[ \sum_{n=2}^{\infty} \frac{1}{(\log n)^{\log n}}, \]

which lies between these two, is analyzed in parts (a) and (b).

(a) Show that $\int_0^{\infty} e^y / y^y \, dy$ exists, by considering the series
$\sum_{n=1}^{\infty} (e/n)^n$.

(b) Show that

\[ \sum_{n=2}^{\infty} \frac{1}{(\log n)^{\log n}} \]

converges, by using the integral test. Hint: Use an appropriate
substitution and part (a).

(c) Show that

\[ \sum_{n=2}^{\infty} \frac{1}{(\log n)^{\log (\log n)}} \]

diverges, by using the integral test. Hint: Use the same substitution
as in part (b), and show directly that the resulting integral diverges.


4. (a) Let $\{a_n\}$ be a sequence of integers with $0 \le a_n \le 9$. Prove that
$\sum_{n=1}^{\infty} a_n 10^{-n}$ exists (and lies between 0 and 1). (This, of course,
is the number which we usually denote by $0.a_1 a_2 a_3 a_4 \ldots$.)

(b) Suppose that $0 \le x \le 1$. Prove that there is a sequence of integers
$\{a_n\}$ with $0 \le a_n \le 9$ and $\sum_{n=1}^{\infty} a_n 10^{-n} = x$. Hint: For example,
$a_1 = [10x]$ (where $[y]$ denotes the greatest integer which is $\le y$).

(c) Show that if $\{a_n\}$ is repeating, i.e., is of the form
$a_1, a_2, \ldots, a_k, a_1, a_2, \ldots, a_k, a_1, a_2, \ldots$, then
$\sum_{n=1}^{\infty} a_n 10^{-n}$ is a rational number (and find it). The same result
naturally holds if $\{a_n\}$ is eventually repeating, i.e., if the sequence
$\{a_{N+k}\}$ is repeating for some $N$.

(d) Prove that if $x = \sum_{n=1}^{\infty} a_n 10^{-n}$ is rational, then $\{a_n\}$ is
eventually repeating. (Just look at the process of finding the decimal expansion
of $p/q$, that is, dividing $q$ into $p$ by long division.)


Suppose that {a,} satisfies the hypothesis of Leibniz’s Theorem. Use the 
proof of Leibniz’s Theorem to obtain the following estimate: 


| Σ (-- 1) Ἔ1ὰ, — [a1 πο αι tes + αν] « i 
ἡ τὶ 


6. Prove the following theorem (another kind of “comparison test”). If
$a_n, b_n > 0$ and $\lim_{n \to \infty} a_n/b_n = c \ne 0$, then
$\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} b_n$ converges.


7. Prove that if $a_n \ge 0$ and $\lim_{n \to \infty} \sqrt[n]{a_n} = r$, then
$\sum_{n=1}^{\infty} a_n$ converges if $r < 1$, and diverges if $r > 1$. (The proof is
very similar to that of the ratio test.) This result is known as the “root test.”
It is easy to construct series for which the ratio test fails, while the root test
works. For example, the root test shows that the series

\[ \tfrac12 + \tfrac13 + \left(\tfrac12\right)^2 + \left(\tfrac13\right)^2 + \left(\tfrac12\right)^3 + \left(\tfrac13\right)^3 + \cdots \]

converges, even though the ratios of successive terms do not approach a
limit. Most examples are of this rather artificial nature, but the root test
is nevertheless quite an important theoretical tool, and if the ratio test
works the root test will also (by Problem 21-13). It is possible to
eliminate limits from the root test; a simple modification of the proof shows
that $\sum_{n=1}^{\infty} a_n$ converges if there is some $s < 1$ such that all but
finitely many $\sqrt[n]{a_n}$ are $\le s$, and that $\sum_{n=1}^{\infty} a_n$ diverges if
infinitely many $\sqrt[n]{a_n}$ are $\ge 1$. This result is known as the “delicate root
test” (there is a similar delicate ratio test). It follows, using the notation of
Problem 21-19, that $\sum_{n=1}^{\infty} a_n$ converges if
$\overline{\lim}_{n \to \infty} \sqrt[n]{a_n} < 1$ and diverges if
$\overline{\lim}_{n \to \infty} \sqrt[n]{a_n} > 1$; no conclusion is possible if
$\overline{\lim}_{n \to \infty} \sqrt[n]{a_n} = 1$.


8. A sequence $\{a_n\}$ is called Cesaro summable, with Cesaro sum $l$, if

\[ \lim_{n \to \infty} \frac{s_1 + \cdots + s_n}{n} = l \]

(where $s_k = a_1 + \cdots + a_k$). Problem 21-12 shows that a summable
sequence is automatically Cesaro summable, with sum equal to its
Cesaro sum. Find a sequence which is not summable, but which is
Cesaro summable.
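Candidate sequences for this problem can be tested by computing the averages in the definition directly. The sketch below tries the sequence $a_n = (-1)^{n+1}$ from the beginning of the chapter; the function name `cesaro_means` is invented for the example, and a computation of this kind can only suggest the value of the limit, not prove it.

```python
from itertools import accumulate

def cesaro_means(terms):
    """Return [(s_1 + ... + s_n) / n for n = 1, 2, ...], where s_k = a_1 + ... + a_k."""
    partial = accumulate(terms)       # s_1, s_2, s_3, ...
    running = accumulate(partial)     # s_1, s_1 + s_2, ...
    return [total / n for n, total in enumerate(running, start=1)]

terms = [(-1) ** (n + 1) for n in range(1, 10001)]
means = cesaro_means(terms)
print(means[:4], means[-1])
```

The partial sums themselves keep jumping between 1 and 0, but their averages drift toward a single value.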


9. This problem outlines an alternative proof of Theorem 7 which does not
rely on the Cauchy criterion.

(a) Suppose that $a_n \ge 0$ for each $n$. Let $\{b_n\}$ be a rearrangement of
$\{a_n\}$, and let $s_n = a_1 + \cdots + a_n$ and $t_n = b_1 + \cdots + b_n$.
Show that for each $n$ there is some $m$ with $s_n \le t_m$.

(b) Show that $\sum_{n=1}^{\infty} a_n \le \sum_{n=1}^{\infty} b_n$.

(c) Show that $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} b_n$.

(d) Now replace the condition $a_n \ge 0$ by the hypothesis that
$\sum_{n=1}^{\infty} a_n$ converges absolutely, using the second part of Theorem 4.


10. (a) Prove that if $\sum_{n=1}^{\infty} a_n$ converges absolutely, and $\{b_n\}$ is any
subsequence of $\{a_n\}$, then $\sum_{n=1}^{\infty} b_n$ converges (absolutely).

(b) Show that this is false if $\sum_{n=1}^{\infty} a_n$ does not converge absolutely.


*(c) Prove that if Σ a, converges absolutely, then 


οο σο 


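11. Prove that if $\sum_{n=1}^{\infty} a_n$ is absolutely convergent, then
$\left| \sum_{n=1}^{\infty} a_n \right| \le \sum_{n=1}^{\infty} |a_n|$.

12. Problem 18-28 shows that $\int_0^{\infty} (\sin x)/x \, dx$ converges. Prove that
$\int_0^{\infty} |(\sin x)/x| \, dx$ diverges.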


Find a continuous function f with f(x) > 0 for all x, such that ie f(x) dx 


exists, but lim f(x) does not exist. 
“τὸ ὦ 


Let f(x) = x sin 1/x for 0 < x < 1, and let f(0) = 0. Recall the defini- 
tion of £(f, P) from Problem 15-32. Show that the set of all £(f, P) for 
P a partition of [0,1] is not bounded (thus f has “‘infinite length’’). 
Hint: Try partitions of the form 


1 | 
P= {0,994 sl, 

2η π 2(η -- 1) n-—1 2 
Let f be the function shown in Figure 5. Find f f, and also the area of 
the shaded region in Figure 5. 
*16. In this problem we will establish the “binomial series”

\[ (1 + x)^{\alpha} = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k, \qquad |x| < 1, \]

for any $\alpha$, by showing that $\lim_{n \to \infty} R_{n,0}(x) = 0$. The proof is in
several steps, and uses the Cauchy and Lagrange forms as found in Problem 19-6.




*17. 


Ἐ18. 


. ; α ᾿ 
(a) Use the ratio test to show that the series > ( 4 γῇ does indeed con- 


verge for |r| < 1 (this is not to say that it necessarily converges to 


(1 + r)@). It follows in particular that lim (“) r®” = Q for |r| < 1. 


TN οὦ 


(Ὁ) Suppose first that 0 < x < 1. Show that lim Ry.o(x) = 0, by using 


Lagrange’s form of the remainder, noticing that (1 + #)7"""' < 1 
forn+1>a. 

(c) Now suppose that —1 < x < 0; the number ¢in Cauchy’s form of the 
remainder satisfies —1 < x < t < 0. Show that 


Χά + ἡ. < |x|AZ, where M = max(i, (1 + χ) ἢ), 


and 


Using Cauchy’s form of the remainder, and the fact that 


m+, ry Ξ 1) 


show that lim Rpy.o(x) = 0. 


*17. (a) Suppose that the partial sums of the sequence $\{a_n\}$ are bounded,
and that $\{b_n\}$ is a decreasing sequence with $\lim_{n \to \infty} b_n = 0$. Prove that
$\sum_{n=1}^{\infty} a_n b_n$ converges. Hint: Use Abel’s Lemma (Problem 21-20) to
check the Cauchy criterion.

(b) Derive Leibniz’s Theorem from this result.

(c) Prove, using Problem 15-31, that the series
$\sum_{n=1}^{\infty} (\cos nx)/n$ converges if $x$ is not of the form $2k\pi$ for any
integer $k$ (in which case it clearly diverges).


*18. Suppose $\{a_n\}$ is decreasing and $\lim_{n \to \infty} a_n = 0$. Prove that if
$\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} 2^n a_{2^n}$ also converges
(the “Cauchy Condensation Theorem”). Notice that the divergence of
$\sum_{n=1}^{\infty} 1/n$ is a special case, for if $\sum_{n=1}^{\infty} 1/n$ converged, then
$\sum_{n=1}^{\infty} 2^n (1/2^n)$ would also converge; this remark may serve as a hint.


19. (a) Prove that if $\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ converge, then
$\sum_{n=1}^{\infty} a_n b_n$ converges.

(b) Prove that if $\sum_{n=1}^{\infty} a_n^2$ converges, then
$\sum_{n=1}^{\infty} a_n/n$ converges.


*20. Suppose $\{a_n\}$ is decreasing and each $a_n \ge 0$. Prove that if
$\sum_{n=1}^{\infty} a_n$ converges, then $\lim_{n \to \infty} n a_n = 0$. Hint: Write down
the Cauchy criterion and be sure to use the fact that $\{a_n\}$ is decreasing.

*21. If $\sum_{n=1}^{\infty} a_n$ converges, then the partial sums $s_n$ are bounded, and
$\lim_{n \to \infty} a_n = 0$. It is tempting to conjecture that boundedness of the
partial sums, together with the condition $\lim_{n \to \infty} a_n = 0$, implies
convergence of $\sum_{n=1}^{\infty} a_n$. This is not true, but finding a counterexample
requires a little ingenuity. As a hint, notice that some subsequence of the
partial sums will have to converge; you must somehow allow this to happen,
without letting the sequence itself converge.


The divergence of $\sum_{n=1}^{\infty} 1/n$ is a particular consequence of the following
remarkable fact: Any positive rational number $x$ can be written as a
finite sum of distinct numbers of the form $1/n$. Such a representation can
always be found by the straightforward method of continually adding
on the next number in the sequence which is not too large. For example,
the calculation

$$\begin{aligned}
\tfrac{19}{15} - \tfrac{1}{2} &= \tfrac{23}{30} \\
\tfrac{23}{30} - \tfrac{1}{3} &= \tfrac{13}{30} \\
\tfrac{13}{30} - \tfrac{1}{4} &= \tfrac{11}{60} \\
\tfrac{11}{60} &< \tfrac{1}{5} \\
\tfrac{11}{60} - \tfrac{1}{6} &= \tfrac{1}{60}
\end{aligned}$$

shows that

$$\tfrac{19}{15} = \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{6} + \tfrac{1}{60}.$$

Notice that the numerators 23, 13, 11, 1 are decreasing. Prove that
every positive rational $x$ can be written in such a way by showing that
the numerators in this sort of calculation must always decrease, so that
eventually the remainder has numerator 1. Hint: You just have to check
that if $p/q < 1/n$, but $p/q > 1/(n + 1)$, then the numerator of
$p/q - 1/(n + 1)$ is less than $p$.
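
The "continually adding on the next number which is not too large" procedure can be tried out mechanically. The following is only a sketch, not part of the problem: the function name and the `start` parameter are hypothetical, and starting the search at $1/2$ is an assumption taken from the worked example. It uses exact rational arithmetic so that the remainders are computed without rounding.

```python
from fractions import Fraction
from math import ceil


def unit_fraction_denominators(x, start=2):
    """Write x as a sum of distinct unit fractions 1/n with n >= start.

    Runs through 1/start, 1/(start+1), ... in order and subtracts every
    term that is not larger than what currently remains.
    """
    remainder = Fraction(x)
    n = start
    denominators = []
    while remainder > 0:
        # Every 1/n that is too large is skipped in one jump:
        # the first usable n is ceil(1 / remainder).
        n = max(n, ceil(1 / remainder))
        denominators.append(n)
        remainder -= Fraction(1, n)
        n += 1
    return denominators


print(unit_fraction_denominators(Fraction(19, 15)))
```

Running the sketch is not a proof; the problem still asks for the argument that the numerators must decrease.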


FIGURE 1 


CHAPTER 


UNIFORM CONVERGENCE AND 
POWER SERIES 


The considerations at the end of the previous chapter suggest an entirely new 
way of looking at infinite series. Our attention will shift from particular 
infinite sums to equations like 


$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots,$$

which concern sums of quantities that depend on $x$. In other words, we are
interested in functions defined by equations of the form

$$f(x) = f_1(x) + f_2(x) + f_3(x) + \cdots$$

(in the previous example $f_n(x) = x^{n-1}/(n - 1)!$). In such a situation $\{f_n\}$
will be some sequence of functions; for each $x$ we obtain a sequence of numbers
$\{f_n(x)\}$, and $f(x)$ is the sum of this sequence. In order to analyze such
functions it will certainly be necessary to remember that each sum

$$f_1(x) + f_2(x) + f_3(x) + \cdots$$

is, by definition, the limit of the sequence

$$f_1(x),\ f_1(x) + f_2(x),\ f_1(x) + f_2(x) + f_3(x),\ \ldots.$$

If we define a new sequence of functions $\{s_n\}$ by

$$s_n = f_1 + \cdots + f_n,$$

then we can express this fact more succinctly by writing

$$f(x) = \lim_{n\to\infty} s_n(x).$$

For some time we shall therefore concentrate on functions defined as limits,

$$f(x) = \lim_{n\to\infty} f_n(x),$$


rather than on functions defined as infinite sums. The total body of results
about such functions can be summed up very easily: nothing one would hope
to be true actually is; instead we have a splendid collection of
counterexamples. The first of these shows that even if each $f_n$ is continuous,
the function $f$ may not be! Contrary to what you may expect, the functions $f_n$
will be very simple. Figure 1 shows the graphs of the functions

$$f_n(x) = \begin{cases} x^n, & 0 \le x \le 1 \\ 1, & x \ge 1. \end{cases}$$




FIGURE 2 


FIGURE 4 




These functions are all continuous, but the function $f(x) = \lim_{n\to\infty} f_n(x)$ is not
continuous; in fact,

$$\lim_{n\to\infty} f_n(x) = \begin{cases} 0, & 0 \le x < 1 \\ 1, & x \ge 1. \end{cases}$$


Another example of this same phenomenon is illustrated in Figure 2; the
functions $f_n$ are defined by

$$f_n(x) = \begin{cases} -1, & x \le -\frac{1}{n} \\ nx, & -\frac{1}{n} \le x \le \frac{1}{n} \\ 1, & \frac{1}{n} \le x. \end{cases}$$

In this case, if $x < 0$, then $f_n(x)$ is eventually (i.e., for large enough $n$) equal
to $-1$, and if $x > 0$, then $f_n(x)$ is eventually 1, while $f_n(0) = 0$ for all $n$. Thus

$$\lim_{n\to\infty} f_n(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ 1, & x > 0; \end{cases}$$

so, once again, the function $f(x) = \lim_{n\to\infty} f_n(x)$ is not continuous.


By rounding off the corners in the previous examples it is even possible to
produce a sequence of differentiable functions $\{f_n\}$ for which the function
$f(x) = \lim_{n\to\infty} f_n(x)$ is not continuous. One such sequence is easy to define
explicitly:

$$f_n(x) = \begin{cases} -1, & x \le -\frac{1}{n} \\ \sin\left(\frac{n\pi x}{2}\right), & -\frac{1}{n} \le x \le \frac{1}{n} \\ 1, & \frac{1}{n} \le x. \end{cases}$$

These functions are differentiable (Figure 3), but we still have

$$\lim_{n\to\infty} f_n(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ 1, & x > 0. \end{cases}$$


Continuity and differentiability are, moreover, not the only properties for
which problems arise. Another difficulty is illustrated by the sequence $\{f_n\}$
shown in Figure 4; on the interval $[0, 1/n]$ the graph of $f_n$ forms an isosceles
triangle of altitude $n$, while $f_n(x) = 0$ for $x \ge 1/n$. These functions may be
defined explicitly as follows:




FIGURE 5 


FIGURE 6 


$$f_n(x) = \begin{cases} 2n^2 x, & 0 \le x \le \frac{1}{2n} \\ 2n - 2n^2 x, & \frac{1}{2n} \le x \le \frac{1}{n} \\ 0, & \frac{1}{n} \le x \le 1. \end{cases}$$


Because this sequence varies so erratically near 0, our primitive mathematical
instincts might suggest that $\lim_{n\to\infty} f_n(x)$ does not always exist.
Nevertheless, this limit does exist for all $x$, and the function
$f(x) = \lim_{n\to\infty} f_n(x)$ is even continuous. In fact, if $x > 0$, then $f_n(x)$ is
eventually 0, so $\lim_{n\to\infty} f_n(x) = 0$; moreover, $f_n(0) = 0$ for all $n$, so that we
certainly have $\lim_{n\to\infty} f_n(0) = 0$. In other words, $f(x) = \lim_{n\to\infty} f_n(x) = 0$ for
all $x$. On the other hand, the integral quickly reveals the strange behavior
of this sequence; we have

$$\int_0^1 f_n(x)\,dx = \frac{1}{2},$$

but

$$\int_0^1 f(x)\,dx = 0.$$

Thus,

$$\lim_{n\to\infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n\to\infty} f_n(x)\,dx.$$
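
The mismatch between the limit of the integrals and the integral of the limit can be checked by direct computation. The sketch below is an illustration only: the helper names `tent` and `integral_on_unit_interval` are hypothetical, $x$ is assumed to lie in $[0, 1]$, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def tent(n, x):
    """The triangle function f_n: height n over [0, 1/n], zero on [1/n, 1]."""
    x = Fraction(x)
    if x <= Fraction(1, 2 * n):
        return 2 * n * n * x
    if x <= Fraction(1, n):
        return 2 * n - 2 * n * n * x
    return Fraction(0)


def integral_on_unit_interval(n):
    """Integral of f_n over [0, 1] by the trapezoid rule on its breakpoints.

    The rule is exact here because f_n is linear between breakpoints.
    """
    pts = [Fraction(0), Fraction(1, 2 * n), Fraction(1, n), Fraction(1)]
    return sum((b - a) * (tent(n, a) + tent(n, b)) / 2
               for a, b in zip(pts, pts[1:]))


x = Fraction(1, 100)
for n in (1, 10, 100, 1000):
    # The integral never changes, while f_n(x) is 0 once 1/n <= x.
    print(n, integral_on_unit_interval(n), tent(n, x))
```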


This particular sequence of functions behaves in a way that we really never
imagined when we first considered functions defined by limits. Although it is
true that

$$f(x) = \lim_{n\to\infty} f_n(x) \quad\text{for each } x \text{ in } [0, 1],$$

the graphs of the functions $f_n$ do not "approach" the graph of $f$ in the sense of
lying close to it: if, as in Figure 5, we draw a strip around $f$ of total width $2\varepsilon$
(allowing a width of $\varepsilon$ above and below), then the graphs of $f_n$ do not lie
completely within this strip, no matter how large an $n$ we choose. Of course, for
each $x$ there is some $N$ such that the point $(x, f_n(x))$ lies in this strip for $n > N$;
this assertion just amounts to the fact that $\lim_{n\to\infty} f_n(x) = f(x)$. But it is necessary
to choose larger and larger $N$'s as $x$ is chosen closer and closer to 0, and no one
$N$ will work for all $x$ at once.

The same situation actually occurs, though less blatantly, for each of the
other examples given previously. Figure 6 illustrates this point for the sequence

$$f_n(x) = \begin{cases} x^n, & 0 \le x \le 1 \\ 1, & x \ge 1. \end{cases}$$

A strip of total width $2\varepsilon$ has been drawn around the graph of $f(x) = \lim_{n\to\infty} f_n(x)$.
If $\varepsilon < \tfrac{1}{2}$, this strip consists of two pieces, which contain no points with second


FIGURE 7 


DEFINITION 


THEOREM 1 




coordinate equal to $\tfrac{1}{2}$; since each function $f_n$ takes on the value $\tfrac{1}{2}$, the graph
of each $f_n$ fails to lie within this strip. Once again, for each point $x$ there is
some $N$ such that $(x, f_n(x))$ lies in the strip for $n > N$; but it is not possible to
pick one $N$ which works for all $x$ at once.

It is easy to check that precisely the same situation occurs for each of the
other examples. In each case we have a function $f$, and a sequence of functions
$\{f_n\}$, all defined on some set $A$, such that

$$f(x) = \lim_{n\to\infty} f_n(x) \quad\text{for all } x \text{ in } A.$$

This means that

for all $\varepsilon > 0$, and for all $x$ in $A$, there is some $N$ such that if $n > N$,
then $|f(x) - f_n(x)| < \varepsilon$.

But in each case different $N$'s must be chosen for different $x$'s, and in each case
it is not true that

for all $\varepsilon > 0$ there is some $N$ such that for all $x$ in $A$, if $n > N$, then
$|f(x) - f_n(x)| < \varepsilon$.


Although this condition differs from the first only by a minor displacement
of the phrase "for all $x$ in $A$," it has a totally different significance. If a
sequence $\{f_n\}$ satisfies this second condition, then the graphs of $f_n$ eventually
lie close to the graph of $f$, as illustrated in Figure 7. This condition turns out to
be just the one which makes the study of limit functions feasible.


Let $\{f_n\}$ be a sequence of functions defined on $A$, and let $f$ be a function
which is also defined on $A$. Then $f$ is called the uniform limit of $\{f_n\}$ on $A$
if for every $\varepsilon > 0$ there is some $N$ such that for all $x$ in $A$,

if $n > N$, then $|f(x) - f_n(x)| < \varepsilon$.

We also say that $\{f_n\}$ converges uniformly to $f$ on $A$, or that $f_n$ approaches
$f$ uniformly on $A$.


As a contrast to this definition, if we know only that

$$f(x) = \lim_{n\to\infty} f_n(x) \quad\text{for each } x \text{ in } A,$$

then we say that $\{f_n\}$ converges pointwise to $f$ on $A$. Clearly, uniform
convergence implies pointwise convergence (but not conversely!).
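
A small numerical illustration of the difference, using the sequence $f_n(x) = x^n$ from the earlier examples. This is only a sketch: the function names are hypothetical, the smaller interval $[0, 0.9]$ is an arbitrary choice made for the comparison, and floating-point values are approximate.

```python
def f(n, x):
    """f_n(x) = x**n on [0, 1]."""
    return x ** n


def witness(n):
    """A point of [0, 1) at which f_n equals 1/2: the n-th root of 1/2."""
    return 0.5 ** (1.0 / n)


for n in (1, 10, 100, 1000):
    x = witness(n)
    # On [0, 1): the pointwise limit is 0, yet f_n(x) stays near 1/2 at the
    # witness, so no single N works for every x at once.
    # On [0, 0.9]: the largest value of |f_n - 0| is 0.9**n, which shrinks to 0.
    print(n, x, f(n, x), 0.9 ** n)
```

The witness points creep toward 1 as $n$ grows, which is exactly the "larger and larger $N$" phenomenon described in the text.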

Evidence for the usefulness of uniform convergence is not at all difficult to
amass. Integrals represent a particularly easy topic; Figure 7 makes it almost
obvious that if $\{f_n\}$ converges uniformly to $f$, then the integral of $f_n$ can be
made as close to the integral of $f$ as desired. Expressed more precisely, we have
the following theorem.


Suppose that $\{f_n\}$ is a sequence of functions which are integrable on $[a, b]$,
and that $\{f_n\}$ converges uniformly on $[a, b]$ to a function $f$ which is integrable



PROOF 


THEOREM 2 


PROOF 


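on $[a, b]$. Then

$$\int_a^b f = \lim_{n\to\infty} \int_a^b f_n.$$

Let $\varepsilon > 0$. There is some $N$ such that for all $n > N$ we have

$$|f(x) - f_n(x)| < \varepsilon \quad\text{for all } x \text{ in } [a, b].$$

Thus, if $n > N$ we have

$$\begin{aligned}
\left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| &= \left| \int_a^b [f(x) - f_n(x)]\,dx \right| \\
&\le \int_a^b |f(x) - f_n(x)|\,dx \\
&\le \int_a^b \varepsilon\,dx \\
&= \varepsilon(b - a).
\end{aligned}$$

Since this is true for any $\varepsilon > 0$, it follows that

$$\int_a^b f = \lim_{n\to\infty} \int_a^b f_n.$$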


The treatment of continuity is only a little more difficult, involving an
"$\varepsilon/3$-argument," a three-step estimate of $|f(x) - f(x + h)|$. If $\{f_n\}$ is a
sequence of continuous functions which converges uniformly to $f$, then there
is some $n$ such that

(1) $|f(x) - f_n(x)| < \varepsilon/3$,
(2) $|f(x + h) - f_n(x + h)| < \varepsilon/3$.

Moreover, since $f_n$ is continuous, for sufficiently small $h$ we have

(3) $|f_n(x) - f_n(x + h)| < \varepsilon/3$.

It will follow from (1), (2), and (3) that $|f(x) - f(x + h)| < \varepsilon$. In order to
obtain (3), however, we must restrict the size of $|h|$ in a way that cannot be
predicted until $n$ has already been chosen; it is therefore quite essential that
there be some fixed $n$ which makes (2) true, no matter how small $|h|$ may be.
It is precisely at this point that uniform convergence enters the proof.


Suppose that $\{f_n\}$ is a sequence of functions which are continuous on $[a, b]$,
and that $\{f_n\}$ converges uniformly on $[a, b]$ to $f$. Then $f$ is also continuous on
$[a, b]$.


For each $x$ in $[a, b]$ we must prove that $f$ is continuous at $x$. We will deal only
with $x$ in $(a, b)$; the cases $x = a$ and $x = b$ require the usual simple
modifications.




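Let $\varepsilon > 0$. Since $\{f_n\}$ converges uniformly to $f$ on $[a, b]$, there is some $n$
such that

$$|f(y) - f_n(y)| < \frac{\varepsilon}{3} \quad\text{for all } y \text{ in } [a, b].$$

In particular, for all $h$ such that $x + h$ is in $[a, b]$, we have

(1) $|f(x) - f_n(x)| < \varepsilon/3$,
(2) $|f(x + h) - f_n(x + h)| < \varepsilon/3$.

Now $f_n$ is continuous, so there is some $\delta > 0$ such that for $|h| < \delta$ we have

(3) $|f_n(x) - f_n(x + h)| < \varepsilon/3$.

Thus, if $|h| < \delta$, then

$$\begin{aligned}
|f(x + h) - f(x)| &= |f(x + h) - f_n(x + h) + f_n(x + h) - f_n(x) + f_n(x) - f(x)| \\
&\le |f(x + h) - f_n(x + h)| + |f_n(x + h) - f_n(x)| + |f_n(x) - f(x)| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} \\
&= \varepsilon.
\end{aligned}$$

This proves that $f$ is continuous at $x$. ∎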


FIGURE 8 


After the two noteworthy successes provided by Theorem 1 and Theorem 2,
the situation for differentiability turns out to be very disappointing. If each
$f_n$ is differentiable, and if $\{f_n\}$ converges uniformly to $f$, it is still not
necessarily true that $f$ is differentiable. For example, Figure 8 shows that there is a
sequence of differentiable functions $\{f_n\}$ which converges uniformly to the




THEOREM 3 


PROOF 


function $f(x) = |x|$. Even if $f$ is differentiable, it may not be true that

$$f'(x) = \lim_{n\to\infty} f_n'(x);$$

this is not at all surprising if we reflect that a smooth function can be
approximated by very rapidly oscillating functions. For example (Figure 9), if

$$f_n(x) = \frac{1}{n} \sin(n^2 x),$$

then $\{f_n\}$ converges uniformly to the function $f(x) = 0$, but

$$f_n'(x) = n \cos(n^2 x),$$

and $\lim_{n\to\infty} n \cos(n^2 x)$ does not always exist (for example, it does not exist if
$x = 0$).
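
The two facts used in this example, that $|f_n(x)| \le 1/n$ everywhere while $f_n'(0) = n$, can be seen in a few lines. The sketch below is illustrative only; the function names are hypothetical and the derivative is the closed form quoted above rather than a numerical derivative.

```python
import math


def f(n, x):
    """f_n(x) = (1/n) sin(n^2 x)."""
    return math.sin(n * n * x) / n


def f_prime(n, x):
    """f_n'(x) = n cos(n^2 x)."""
    return n * math.cos(n * n * x)


for n in (1, 10, 100):
    # 1/n bounds |f_n| for every x, so f_n -> 0 uniformly,
    # yet the slope at 0 is n, which grows without bound.
    print(n, 1.0 / n, f_prime(n, 0.0))
```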
FIGURE 9


Despite such examples, the Fundamental Theorem of Calculus practically 
guarantees that some sort of theorem about derivatives will be a consequence 
of Theorem 1; the crucial hypothesis is that $\{f_n'\}$ converge uniformly (to
some continuous function).


Suppose that $\{f_n\}$ is a sequence of functions which are differentiable on
$[a, b]$, and that $\{f_n\}$ converges (pointwise) to $f$. Suppose, moreover, that
$\{f_n'\}$ converges uniformly on $[a, b]$ to some continuous function $g$. Then $f$ is
differentiable and

$$f'(x) = \lim_{n\to\infty} f_n'(x).$$

Applying Theorem 1 to the interval $[a, x]$, we see that for each $x$ we have

$$\begin{aligned}
\int_a^x g &= \lim_{n\to\infty} \int_a^x f_n' \\
&= \lim_{n\to\infty} [f_n(x) - f_n(a)] \\
&= f(x) - f(a).
\end{aligned}$$


DEFINITION 


COROLLARY > 




Since $g$ is continuous, it follows that $f'(x) = g(x) = \lim_{n\to\infty} f_n'(x)$ for all $x$ in
the interval $[a, b]$. ∎

Now that the basic facts about uniform limits have been established, it is
clear how to treat functions defined as infinite sums,

$$f(x) = f_1(x) + f_2(x) + f_3(x) + \cdots.$$

This equation means that

$$f(x) = \lim_{n\to\infty} [f_1(x) + \cdots + f_n(x)];$$

our previous theorems apply when the new sequence

$$f_1,\ f_1 + f_2,\ f_1 + f_2 + f_3,\ \ldots$$

converges uniformly to $f$. Since this is the only case we shall ever be
interested in, we single it out with a definition.




The series $\sum_{n=1}^{\infty} f_n$ converges uniformly (more formally: the sequence $\{f_n\}$
is uniformly summable) to $f$ on $A$, if the sequence

$$f_1,\ f_1 + f_2,\ f_1 + f_2 + f_3,\ \ldots$$

converges uniformly to $f$ on $A$.


We can now apply each of Theorems 1, 2, and 3 to uniformly convergent
series; the results may be stated in one common corollary.


Let $\sum_{n=1}^{\infty} f_n$ converge uniformly to $f$ on $[a, b]$.

(1) If each $f_n$ is continuous on $[a, b]$, then $f$ is continuous on $[a, b]$.
(2) If $f$ and each $f_n$ is integrable on $[a, b]$, then

$$\int_a^b f = \sum_{n=1}^{\infty} \int_a^b f_n.$$

Moreover, if $\sum_{n=1}^{\infty} f_n$ converges (pointwise) to $f$ on $[a, b]$, and $\sum_{n=1}^{\infty} f_n'$ converges
uniformly on $[a, b]$ to some continuous function, then

(3) $f'(x) = \sum_{n=1}^{\infty} f_n'(x)$ for all $x$ in $[a, b]$.




PROOF 


THEOREM 4 
(THE WEIERSTRASS M-TEST) 


PROOF 


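(1) If each $f_n$ is continuous, then so is each $f_1 + \cdots + f_n$, and $f$ is the
uniform limit of the sequence $f_1,\ f_1 + f_2,\ f_1 + f_2 + f_3,\ \ldots$, so $f$ is
continuous by Theorem 2.

(2) Since $f_1,\ f_1 + f_2,\ f_1 + f_2 + f_3,\ \ldots$ converges uniformly to $f$, it
follows from Theorem 1 that

$$\begin{aligned}
\int_a^b f &= \lim_{n\to\infty} \int_a^b (f_1 + \cdots + f_n) \\
&= \lim_{n\to\infty} \left( \int_a^b f_1 + \cdots + \int_a^b f_n \right) \\
&= \sum_{n=1}^{\infty} \int_a^b f_n.
\end{aligned}$$

(3) Each function $f_1 + \cdots + f_n$ is differentiable, with derivative
$f_1' + \cdots + f_n'$, and $f_1',\ f_1' + f_2',\ f_1' + f_2' + f_3',\ \ldots$ converges
uniformly to a continuous function, by hypothesis. It follows
from Theorem 3 that

$$\begin{aligned}
f'(x) &= \lim_{n\to\infty} [f_1'(x) + \cdots + f_n'(x)] \\
&= \sum_{n=1}^{\infty} f_n'(x). \ ∎
\end{aligned}$$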


At the moment this corollary is not very useful, since it seems quite difficult
to predict when the sequence $f_1,\ f_1 + f_2,\ f_1 + f_2 + f_3,\ \ldots$ will converge
uniformly. The most important condition which ensures such uniform
convergence is provided by the following theorem; the proof is almost a triviality
because of the cleverness with which the very simple hypotheses have been
chosen.


Let $\{f_n\}$ be a sequence of functions defined on $A$, and suppose that $\{M_n\}$ is a
sequence of numbers such that

$$|f_n(x)| \le M_n \quad\text{for all } x \text{ in } A.$$

Suppose moreover that $\sum_{n=1}^{\infty} M_n$ converges. Then for each $x$ in $A$ the series
$\sum_{n=1}^{\infty} f_n(x)$ converges (in fact, it converges absolutely), and $\sum_{n=1}^{\infty} f_n$ converges
uniformly on $A$ to the function

$$f(x) = \sum_{n=1}^{\infty} f_n(x).$$


For each $x$ in $A$ the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges, by the comparison test;




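consequently $\sum_{n=1}^{\infty} f_n(x)$ converges (absolutely). Moreover, for all $x$ in $A$ we have

$$\begin{aligned}
|f(x) - [f_1(x) + \cdots + f_N(x)]| &= \left| \sum_{n=N+1}^{\infty} f_n(x) \right| \\
&\le \sum_{n=N+1}^{\infty} |f_n(x)| \\
&\le \sum_{n=N+1}^{\infty} M_n.
\end{aligned}$$

Since $\sum_{n=1}^{\infty} M_n$ converges, the number $\sum_{n=N+1}^{\infty} M_n$ can be made as small as
desired, by choosing $N$ sufficiently large. ∎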


FIGURE 10 
fie) = {2x} 

FIGURE 11

The following sequence $\{f_n\}$ illustrates a simple application of the
Weierstrass M-test. Let $\{x\}$ denote the distance from $x$ to the nearest integer (the
graph of $f(x) = \{x\}$ is illustrated in Figure 10). Now define

$$f_n(x) = \frac{1}{10^n} \{10^n x\}.$$

The functions $f_1$ and $f_2$ are shown in Figure 11 (but to make the drawings
simpler, $10^n$ has been replaced by $2^n$). This sequence of functions has been
defined so that the Weierstrass M-test automatically applies: clearly

$$|f_n(x)| \le \frac{1}{10^n} \quad\text{for all } x,$$

and $\sum_{n=1}^{\infty} 1/10^n$ converges. Thus $\sum_{n=1}^{\infty} f_n$ converges uniformly; since each $f_n$ is
continuous, the corollary implies that the function

$$f(x) = \sum_{n=1}^{\infty} f_n(x) = \sum_{n=1}^{\infty} \frac{1}{10^n} \{10^n x\}$$


n=l 




THEOREM 5 


PROOF 


is also continuous. Figure 12 shows the graph of the first few partial sums
$f_1 + \cdots + f_n$. As $n$ increases, the graphs become harder and harder to
draw, and the infinite sum $\sum_{n=1}^{\infty} f_n$ is quite undrawable, as shown by the
following theorem (included mainly as an interesting sidelight, to be skipped if you
find the going too rough).
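
Before turning to that theorem, the estimate behind the M-test can be checked on this example. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the sample points $k/997$ are an arbitrary choice, and the check compares two partial sums rather than the full infinite sum. Exact fractions avoid the rounding problems that $10^n x$ would cause in floating point.

```python
from fractions import Fraction


def dist(x):
    """{x}: the distance from x to the nearest integer."""
    return abs(x - round(x))


def partial_sum(N, x):
    """f_1(x) + ... + f_N(x), where f_n(x) = {10^n x} / 10^n."""
    return sum(dist(10 ** n * x) / 10 ** n for n in range(1, N + 1))


def tail_bound(N):
    """Sum of M_n = 1/10^n over all n > N (geometric series): 1/(9 * 10^N)."""
    return Fraction(1, 9 * 10 ** N)


xs = [Fraction(k, 997) for k in range(998)]
for N in (1, 2, 4):
    # Largest gap between the N-th and a later partial sum over the sample;
    # it never exceeds the tail bound, whatever x is.
    gap = max(abs(partial_sum(N + 6, x) - partial_sum(N, x)) for x in xs)
    print(N, float(gap), float(tail_bound(N)))
```

The point of the bound is that it does not depend on $x$, which is what uniform convergence requires.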


The function

$$f(x) = \sum_{n=1}^{\infty} \frac{1}{10^n} \{10^n x\}$$

is continuous everywhere and differentiable nowhere!


We have just shown that $f$ is continuous; this is the only part of the proof which
uses uniform convergence. We will prove that $f$ is not differentiable at $a$, for
any $a$, by the straightforward method of exhibiting a particular sequence
$\{h_m\}$ approaching 0 for which

$$\lim_{m\to\infty} \frac{f(a + h_m) - f(a)}{h_m}$$

does not exist. It obviously suffices to consider only those numbers $a$ satisfying
$0 < a \le 1$.
Suppose that the decimal expansion of $a$ is

$$a = 0.a_1 a_2 a_3 a_4 \ldots.$$

Let $h_m = 10^{-m}$ if $a_m \ne 4$ or 9, but let $h_m = -10^{-m}$ if $a_m = 4$ or 9 (the reason
for these two exceptions will appear soon). Then

$$\begin{aligned}
\frac{f(a + h_m) - f(a)}{h_m} &= \sum_{n=1}^{\infty} \frac{1}{10^n} \cdot \frac{\{10^n(a + h_m)\} - \{10^n a\}}{\pm 10^{-m}} \\
&= \sum_{n=1}^{\infty} \pm 10^{m-n} [\{10^n(a + h_m)\} - \{10^n a\}].
\end{aligned}$$

This infinite series is really a finite sum, because if $n \ge m$, then $10^n h_m$ is an
integer, so

$$\{10^n(a + h_m)\} - \{10^n a\} = 0.$$

On the other hand, for $n < m$ we can write

$$\begin{aligned}
10^n a &= \text{integer} + 0.a_{n+1} a_{n+2} a_{n+3} \ldots a_m \ldots \\
10^n(a + h_m) &= \text{integer} + 0.a_{n+1} a_{n+2} a_{n+3} \ldots (a_m \pm 1) \ldots
\end{aligned}$$

(in order for the second equation to be true it is essential that we choose
$h_m = -10^{-m}$ when $a_m = 9$). Now suppose that

$$0.a_{n+1} a_{n+2} a_{n+3} \ldots a_m \ldots \le \tfrac{1}{2}.$$




Then we also have

$$0.a_{n+1} a_{n+2} a_{n+3} \ldots (a_m \pm 1) \ldots \le \tfrac{1}{2}$$

(in the special case $m = n + 1$ the second equation is true because we chose
$h_m = -10^{-m}$ when $a_m = 4$). This means that

$$\{10^n(a + h_m)\} - \{10^n a\} = \pm 10^{n-m},$$

and exactly the same equation can be derived when $0.a_{n+1} a_{n+2} a_{n+3} \ldots > \tfrac{1}{2}$.
Thus, for $n < m$ we have

$$10^{m-n} [\{10^n(a + h_m)\} - \{10^n a\}] = \pm 1.$$

In other words,

$$\frac{f(a + h_m) - f(a)}{h_m}$$

is the sum of $m - 1$ numbers, each of which is $\pm 1$. Now adding $+1$ or $-1$
to a number changes it from odd to even, and vice versa. The sum of $m - 1$
numbers each $\pm 1$ is therefore an even integer if $m$ is odd, and an odd integer if $m$
is even. Consequently the sequence of ratios

$$\frac{f(a + h_m) - f(a)}{h_m}$$

cannot possibly converge, since it is a sequence of integers which are
alternately odd and even. ∎
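
The parity argument can be watched in action for a specific $a$. The following sketch is only an illustration: the function names are hypothetical, the choice $a = 1/7$ is arbitrary, and $a$ is assumed to be a rational number in $(0, 1]$ with a non-terminating decimal expansion so that its digits are unambiguous. It relies on the fact proved above that only the terms with $n < m$ contribute to the quotient.

```python
from fractions import Fraction
from math import floor


def dist(x):
    """{x}: the distance from x to the nearest integer."""
    return abs(x - round(x))


def difference_quotient(a, m):
    """[f(a + h_m) - f(a)] / h_m with h_m chosen as in the proof.

    Terms with n >= m vanish, so the finite sum over n < m is the exact value.
    """
    digit = floor(a * 10 ** m) % 10              # a_m, the m-th decimal digit
    h = Fraction(-1 if digit in (4, 9) else 1, 10 ** m)
    return sum(
        (dist(10 ** n * (a + h)) - dist(10 ** n * a)) / (10 ** n * h)
        for n in range(1, m)
    )


a = Fraction(1, 7)                               # 0.142857142857...
print([difference_quotient(a, m) for m in range(1, 9)])
```

Each printed value should be an integer whose parity matches that of $m - 1$, so the list cannot settle down to a limit.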


FIGURE 12

In addition to its role in the previous theorem, the Weierstrass M-test is an
ideal tool for analyzing functions which are very well behaved. We will give
special attention to functions of the form

$$f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n,$$

which can also be described by the equation

$$f(x) = \sum_{n=0}^{\infty} f_n(x)$$

for $f_n(x) = a_n(x - a)^n$. Such an infinite sum, of functions which depend only
on powers of $(x - a)$, is called a power series centered at $a$. For the sake of
simplicity, we will usually concentrate on power series centered at 0,

$$f(x) = \sum_{n=0}^{\infty} a_n x^n.$$

One especially important group of power series are those of the form

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n,$$




THEOREM 6 


where $f$ is some function which has derivatives of all orders at $a$; this series is
called the Taylor series for $f$ at $a$. Of course, it is not necessarily true that

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n;$$

this equation holds only when the remainder terms satisfy $\lim_{n\to\infty} R_{n,a}(x) = 0$.

We already know that a power series $\sum_{n=0}^{\infty} a_n x^n$ does not necessarily converge
for all $x$. For example the power series

$$x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots$$

converges only for $|x| \le 1$, while the power series

$$x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \cdots$$

converges only for $-1 < x \le 1$. It is even possible to produce a power series
which converges only for $x = 0$. For example, the power series

$$\sum_{n=0}^{\infty} n!\,x^n$$

does not converge for $x \ne 0$; indeed, the ratios

$$\frac{(n + 1)!\,x^{n+1}}{n!\,x^n} = (n + 1)x$$

are unbounded for any $x \ne 0$. If a power series $\sum_{n=0}^{\infty} a_n x^n$ does converge for
some $x_0 \ne 0$ however, then a great deal can be said about the series $\sum_{n=0}^{\infty} a_n x^n$
for $|x| < |x_0|$.


Suppose that the series

$$f(x_0) = \sum_{n=0}^{\infty} a_n x_0^n$$

converges, and let $a$ be any number with $0 < a < |x_0|$. Then on $[-a, a]$ the
series

$$f(x) = \sum_{n=0}^{\infty} a_n x^n$$

converges uniformly (and absolutely). Moreover, the same is true for the
series

$$g(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}.$$

Finally, $f$ is differentiable and

$$f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$$

for all $x$ with $|x| < |x_0|$.


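PROOF Since $\sum_{n=0}^{\infty} a_n x_0^n$ converges, the terms $a_n x_0^n$ approach 0. Hence they are surely
bounded: there is some number $M$ such that

$$|a_n x_0^n| = |a_n| \cdot |x_0^n| \le M \quad\text{for all } n.$$

Now if $x$ is in $[-a, a]$, then $|x| \le |a|$, so

$$\begin{aligned}
|a_n x^n| &= |a_n| \cdot |x^n| \\
&\le |a_n| \cdot |a^n| \\
&= |a_n| \cdot |x_0^n| \cdot \left| \frac{a}{x_0} \right|^n \quad\text{(this is the clever step)} \\
&\le M \left| \frac{a}{x_0} \right|^n.
\end{aligned}$$

But $|a/x_0| < 1$, so the (geometric) series

$$\sum_{n=0}^{\infty} M \left| \frac{a}{x_0} \right|^n = M \sum_{n=0}^{\infty} \left| \frac{a}{x_0} \right|^n$$

converges. Choosing $M \cdot |a/x_0|^n$ as the number $M_n$ in the Weierstrass M-test,
it follows that $\sum_{n=0}^{\infty} a_n x^n$ converges uniformly on $[-a, a]$.

To prove the same assertion for $g(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$ notice that

$$\begin{aligned}
|n a_n x^{n-1}| &= n |a_n| \cdot |x^{n-1}| \\
&\le n |a_n| \cdot |a^{n-1}| \\
&= \frac{|a_n|}{|a|} \cdot |x_0^n| \cdot n \left| \frac{a}{x_0} \right|^n \\
&\le \frac{M}{|a|}\, n \left| \frac{a}{x_0} \right|^n.
\end{aligned}$$

Since $|a/x_0| < 1$, the series

$$\sum_{n=1}^{\infty} \frac{M}{|a|}\, n \left| \frac{a}{x_0} \right|^n = \frac{M}{|a|} \sum_{n=1}^{\infty} n \left| \frac{a}{x_0} \right|^n$$

converges (this fact was proved in Chapter 22 as an application of the ratio
test). Another appeal to the Weierstrass M-test proves that $\sum_{n=1}^{\infty} n a_n x^{n-1}$
converges uniformly on $[-a, a]$.

Finally, our corollary proves, first that $g$ is continuous, and then that

$$f'(x) = g(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} \quad\text{for } x \text{ in } [-a, a].$$

Since we could have chosen any number $a$ with $0 < a < |x_0|$, this result holds
for all $x$ with $|x| < |x_0|$. ∎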


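Theorem 6, applied to the infinite series

$$\begin{aligned}
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots, \\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \\
e^x &= 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,
\end{aligned}$$

yields precisely the results which are expected. Each of these converges for any
$x_0$, hence the conclusions of Theorem 6 apply for any $x$:

$$\begin{aligned}
\sin'(x) &= 1 - \frac{3x^2}{3!} + \frac{5x^4}{5!} - \cdots = \cos x, \\
\cos'(x) &= -\frac{2x}{2!} + \frac{4x^3}{4!} - \frac{6x^5}{6!} + \cdots = -\sin x, \\
\exp'(x) &= 1 + \frac{2x}{2!} + \frac{3x^2}{3!} + \cdots = \exp(x).
\end{aligned}$$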

For the functions arctan and $f(x) = \log(1 + x)$ the situation is only slightly
more complicated. Since the series

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots$$

converges for $x_0 = 1$, it also converges for $|x| < 1$, and

$$\arctan'(x) = 1 - x^2 + x^4 - x^6 + \cdots = \frac{1}{1 + x^2} \quad\text{for } |x| < 1.$$

In this case, the series happens to converge for $x = -1$ also. However, the
formula for the derivative is not correct for $x = 1$ or $x = -1$; indeed the
series

$$1 - x^2 + x^4 - x^6 + \cdots$$

diverges for $x = 1$ and $x = -1$. Notice that this does not contradict Theorem
6, which proves that the derivative is given by the expected formula only for
$|x| < |x_0|$.
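
A quick numerical check of this discussion, offered only as a sketch: the function names and the number of terms are arbitrary choices, and partial sums in floating point merely approximate the series.

```python
import math


def arctan_series(x, terms=200):
    """Partial sum of x - x^3/3 + x^5/5 - ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))


def differentiated_series(x, terms=200):
    """Partial sum of the term-by-term derivative 1 - x^2 + x^4 - ..."""
    return sum((-1) ** k * x ** (2 * k) for k in range(terms))


x = 0.5
print(arctan_series(x), math.atan(x))
print(differentiated_series(x), 1 / (1 + x * x))

# At x = 1 the differentiated partial sums never settle down.
print([differentiated_series(1.0, terms=t) for t in range(1, 7)])
```

Inside $(-1, 1)$ the two columns agree closely, while at $x = 1$ the partial sums of the differentiated series keep jumping between two values.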




Since the series

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$

converges for $x_0 = 1$, it also converges for $|x| < 1$, and

$$\frac{1}{1 + x} = \log'(1 + x) = 1 - x + x^2 - x^3 + \cdots \quad\text{for } |x| < 1.$$

In this case, the original series does not converge for $x = -1$; moreover, the
differentiated series does not converge for $x = 1$.

All the considerations which apply to a power series will automatically
apply to its derivative, at the points where the derivative is represented by a
power series. If


$$f(x) = \sum_{n=0}^{\infty} a_n x^n$$

converges for all $x$ in some interval $(-R, R)$, then Theorem 6 implies that

$$f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$$

for all $x$ in $(-R, R)$. Applying Theorem 6 once again we find that

$$f''(x) = \sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2},$$

and proceeding by induction we find that

$$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1)\, a_n x^{n-k}.$$

Thus, a function defined by a power series which converges in some interval
$(-R, R)$ is automatically infinitely differentiable in that interval. Moreover,
the previous equation implies that

$$f^{(k)}(0) = k!\, a_k,$$

so that

$$a_k = \frac{f^{(k)}(0)}{k!}.$$

In other words, a convergent power series centered at 0 is always the Taylor series at 0
of the function which it defines.

On this happy note we could easily end our study of power series and 
Taylor series. A careful assessment of our situation will reveal some un- 
explained facts however. 

The Taylor series of sin, cos, and exp are as satisfactory as we could desire; 
they converge for all x, and can be differentiated term-by-term for all x. The 
Taylor series of the function f(x) = log(1 + x) is slightly less pleasing, because 




it converges only for $-1 < x \le 1$, but this deficiency is a necessary
consequence of the basic nature of power series. If the Taylor series for $f$ converged
for any $x_0$ with $|x_0| > 1$, then it would converge on the interval $(-|x_0|, |x_0|)$, and
on this interval the function which it defines would be differentiable, and thus
continuous. But this is impossible, since it is unbounded on the interval
$(-1, 1)$, where it equals $\log(1 + x)$.


The Taylor series for arctan is more difficult to comprehend; there seems
to be no possible excuse for the refusal of this series to converge when $|x| > 1$.
This mysterious behavior is exemplified even more strikingly by the function
$f(x) = 1/(1 + x^2)$, an infinitely differentiable function which is the next best
thing to a polynomial function. The Taylor series of $f$ is given by

$$f(x) = \frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + x^8 - \cdots.$$

If $|x| > 1$ the Taylor series does not converge at all. Why? What unseen
obstacle prevents the Taylor series from extending past 1 and $-1$? Asking
this sort of question is always dangerous, since we may have to settle for an
unsympathetic answer: it happens because it happens, and that's the way things
are! In this case there does happen to be an explanation, but this explanation
is impossible to give at the present time; although the question is about real
numbers, it can be answered intelligently only when placed in a broader
context. It will therefore be necessary to devote two chapters to quite new
material before completing our discussion of Taylor series in Chapter 26.


PROBLEMS 


1, 


For each of the following sequences $\{f_n\}$, determine the pointwise limit
of $\{f_n\}$ (if it exists) on the indicated interval, and decide whether $\{f_n\}$
converges uniformly to this function.


(i) fa(x) = Vx, on [0, 1]. 

DOS Ἢ : Ἐπὶ on ἴα, ὁ], and on R. 
(1) 7.0) = > on (1, &). 

(iv) f(x) = ec", on [—1, 1]. 

(v) f(x) = =~ on R. 


Find the Taylor series at 0 for each of the following functions. 


(i) fla) --- a #0. 


(ii) f(x) = log(x —a), a 590. 




Gil) f(s) = Go 


“πὰ 


ἀν) 70) =§ Gos 


(v) f(*) = arcsin x. 


= (1 — x)~/*% (Use Problem 19-6.) 


Find each of the following infinite sums. 


: Ge, i ee 

(i) Pee a ae a 

(ii) $1 - x^3 + x^6 - x^9 + \cdots$. Hint: What is $1 - x + x^2 - x^3 + \cdots$?
re x x x , ᾿ : 
(iii) pare + ΗΝ oe + ++ for |x| < 1. Hint: Differentiate. 
; | re | 1 1 

Ὧν; 2.1 3.9 4-3 Ν᾿ τὰν ΝΙΝ os Problem 10.) 

εν) Mice πο ἘΞ io ΚΣ 


If $f(x) = (\sin x)/x$ for $x \ne 0$ and $f(0) = 1$, find $f^{(k)}(0)$. Hint: Find the
power series for $f$.


In this problem we deduce the binomial series $(1 + x)^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$,
$|x| < 1$ without all the work of Problem 22-16, although we will use a
fact established in part (a) of that problem: the series $f(x) = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$ does
converge for $|x| < 1$.

(a) Prove that $(1 + x)f'(x) = \alpha f(x)$ for $|x| < 1$.

(b) Now show that any function $f$ satisfying part (a) is of the form
$f(x) = c(1 + x)^\alpha$ for some constant $c$, and use this fact to establish
the binomial series. Hint: Consider $g(x) = f(x)/(1 + x)^\alpha$.

(a) Suppose that $f(x) = \sum_{n=0}^{\infty} a_n x^n$ converges for all $x$ in some interval
$(-R, R)$ and that $f(x) = 0$ for all $x$ in $(-R, R)$. Prove that each
$a_n = 0$. (If you remember the formula for $a_n$ this is easy.)

(b) Suppose we know only that $f(x_n) = 0$ for some sequence $\{x_n\}$ with
$\lim_{n\to\infty} x_n = 0$. Prove again that each $a_n = 0$. Hint: First show that
$f(0) = a_0 = 0$; then that $f'(0) = a_1 = 0$, etc.

This result shows that if $f(x) = e^{-1/x^2} \sin 1/x$ for $x \ne 0$, then $f$ cannot
possibly be written as a power series. It also shows that a function
defined by a power series cannot be 0 for $x \le 0$ but nonzero for
$x > 0$; thus a power series cannot describe the motion of a particle
which has remained at rest until time 0, and then begins to move!


*8. 


*10. 


(c) Suppose that $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n$ converge for all $x$
in some interval containing 0 and that $f(t_m) = g(t_m)$ for some
sequence $\{t_m\}$ converging to 0. Show that $a_n = b_n$ for each $n$. In
particular, a function can have only one representation as a power series
centered at 0.




Prove that if $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is an even function, then $a_n = 0$ for $n$ odd,
and if $f$ is an odd function, then $a_n = 0$ for $n$ even.

Recall that the Fibonacci sequence $\{a_n\}$ is defined by $a_1 = a_2 = 1$,
$a_{n+1} = a_n + a_{n-1}$.

(a) Show that $a_{n+1}/a_n \le 2$.
(b) Let

$$f(x) = \sum_{n=1}^{\infty} a_n x^{n-1} = 1 + x + 2x^2 + 3x^3 + \cdots.$$

Use the ratio test to prove that $f(x)$ converges if $|x| < 1/2$.
(c) Prove that if $|x| < 1/2$, then

$$f(x) = \frac{-1}{x^2 + x - 1}.$$

Hint: This equation can be written $f(x) - xf(x) - x^2 f(x) = 1$.

(d) Use the partial fraction decomposition for $1/(x^2 + x - 1)$, and the
power series for $1/(x - a)$, to obtain another power series for $f$.

(e) It follows from Problem 6 that the two power series obtained for $f$
must be the same. Use this fact to show that

$$a_n = \frac{\left( \dfrac{1 + \sqrt{5}}{2} \right)^n - \left( \dfrac{1 - \sqrt{5}}{2} \right)^n}{\sqrt{5}}.$$
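
The formula in part (e) can be sanity-checked numerically before it is proved. The sketch below is an illustration only: the function names are hypothetical, the coefficients are obtained by matching powers of $x$ in $(1 - x - x^2) \sum c_k x^k = 1$, and the closed form is evaluated in floating point and rounded, which is reliable only for moderately small $n$.

```python
def reciprocal_coefficients(count):
    """First `count` coefficients c_0, c_1, ... of the power series of
    1 / (1 - x - x^2).

    Matching coefficients in (1 - x - x^2) * sum(c_k x^k) = 1 gives
    c_0 = 1, c_1 = 1 and c_k = c_{k-1} + c_{k-2} for k >= 2.
    """
    coeffs = []
    for k in range(count):
        if k < 2:
            coeffs.append(1)
        else:
            coeffs.append(coeffs[k - 1] + coeffs[k - 2])
    return coeffs


def closed_form(n):
    """The expression in part (e), rounded to the nearest integer."""
    root5 = 5 ** 0.5
    return round((((1 + root5) / 2) ** n - ((1 - root5) / 2) ** n) / root5)


coeffs = reciprocal_coefficients(20)
print(coeffs[:8])
print([closed_form(n) for n in range(1, 9)])
```

Since $f(x) = \sum a_n x^{n-1}$, the coefficient $c_k$ corresponds to $a_{k+1}$, which is why the closed form is evaluated starting from $n = 1$.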


Show that the power series for $f(x) = \log(1 - x)$ converges only for
$-1 \le x < 1$, and that the power series for $g(x) = \log[(1 + x)/(1 - x)]$
converges only for $x$ in $(-1, 1)$.


Suppose that $\sum_{n=0}^{\infty} a_n$ converges. We know that the series $f(x) = \sum_{n=0}^{\infty} a_n x^n$
must converge uniformly on $[-a, a]$ for $0 < a < 1$, but it may not
converge uniformly on $[-1, 1]$; in fact, it may not even converge at the
point $-1$ (for example, if $f(x) = \log(1 + x)$). However, a beautiful
theorem of Abel shows that the series does converge uniformly on $[0, 1]$.
Consequently, $f$ is continuous on $[0, 1]$ and, in particular,
$\sum_{n=0}^{\infty} a_n = \lim_{x\to 1^-} \sum_{n=0}^{\infty} a_n x^n$. Prove Abel's Theorem by noticing that if
$|a_m + \cdots + a_n| < \varepsilon$, then $|a_m x^m + \cdots + a_n x^n| < \varepsilon$, by Abel's Lemma (Problem
21-20).


11.


12.
13.
*14.


15.


16.


A sequence $\{a_n\}$ is called Abel summable if $\lim_{x\to 1^-} \sum_{n=0}^{\infty} a_n x^n$ exists; Problem
10 shows that a summable sequence is necessarily Abel summable. Find
a sequence which is Abel summable, but which is not summable. Hint:
Look over the list of Taylor series until you find one which does not
converge at 1, even though the function it represents is continuous at 1.
In Theorem 3 we assumed only that $\{f_n\}$ converges pointwise to $f$. Show
that the remaining hypotheses ensure that $\{f_n\}$ actually converges
uniformly to $f$.

(a) Suppose that $\{f_n\}$ is a sequence of bounded (not necessarily
continuous) functions on $[a, b]$ which converge uniformly to $f$ on $[a, b]$.
Prove that $f$ is bounded on $[a, b]$.
(b) Find a sequence of continuous functions on $[a, b]$ which converge
pointwise to an unbounded function on $[a, b]$.

Suppose that $f$ is differentiable. Prove that the function $f'$ is the pointwise
limit of a sequence of continuous functions. (Since we already know
examples of discontinuous derivatives, this provides another example
where the pointwise limit of continuous functions is not continuous.)

Find a sequence of integrable functions $\{f_n\}$ which converges to the
(nonintegrable) function $f$ that is 1 on the rationals and 0 on the
irrationals. Hint: Each $f_n$ will be 0 except at a few points.

This problem outlines a completely different approach to the integral;
consequently, it is unfair to use any facts about integrals learned
previously.


(a) Let $s$ be a step function on $[a, b]$, so that $s$ is constant on
$(t_{i-1}, t_i)$ for some partition $\{t_0, \ldots, t_n\}$ of $[a, b]$. Define
$\int_a^b s$ as $\sum_{i=1}^{n} s_i (t_i - t_{i-1})$, where $s_i$ is the
(constant) value of $s$ on $(t_{i-1}, t_i)$. Show that this definition does
not depend on the partition $\{t_0, \ldots, t_n\}$.

(b) A function $f$ is called a regulated function on $[a, b]$ if it is the
uniform limit of a sequence of step functions $\{s_n\}$ on $[a, b]$. Show that
in this case there is, for every $\varepsilon > 0$, some $N$ such that for
$m, n > N$ we have $|s_n(x) - s_m(x)| < \varepsilon$ for all $x$ in $[a, b]$.

(c) Show that the sequence of numbers $\left\{\int_a^b s_n\right\}$ will be a
Cauchy sequence.

(d) Suppose that $\{t_n\}$ is another sequence of step functions on $[a, b]$
which converges uniformly to $f$. Show that for every $\varepsilon > 0$ there
is an $N$ such that for $n > N$ we have $|s_n(x) - t_n(x)| < \varepsilon$ for
$x$ in $[a, b]$.

(e) Conclude that $\lim_{n \to \infty} \int_a^b s_n = \lim_{n \to \infty} \int_a^b t_n$.
This means that we can define $\int_a^b f$ to be
$\lim_{n \to \infty} \int_a^b s_n$ for any sequence of step functions
$\{s_n\}$ converging uniformly to $f$. The only remaining question is: Which
functions are regulated? Here is a partial answer.

*(f) Prove that a continuous function is regulated. Hint: To find a step
function $s$ on $[a, b]$ with $|f(x) - s(x)| < \varepsilon$ for all $x$ in
$[a, b]$, consider all $y$ for which there is such a step function on $[a, y]$.

*17. Find a sequence $\{f_n\}$ approaching $f$ uniformly on $[0, 1]$ for which
$\lim_{n \to \infty}$ (length of $f_n$ on $[0, 1]$) $\neq$ length of $f$ on
$[0, 1]$. (Length is defined in Problem 15-32, but the simplest example will
involve functions the length of whose graphs will be obvious.)


CHAPTER 


COMPLEX NUMBERS 


With the exception of the last few paragraphs of the previous chapter, this
book has presented unremitting propaganda for the real numbers. Nevertheless,
the real numbers do have a great deficiency: not every polynomial function
has a root. The simplest and most notable example is the fact that no number
$x$ can satisfy $x^2 + 1 = 0$. This deficiency is so severe that long ago
mathematicians felt the need to "invent" a number $i$ with the property that
$i^2 + 1 = 0$. For a long time the status of the "number" $i$ was quite
mysterious: since there is no number $x$ satisfying $x^2 + 1 = 0$, it is
nonsensical to say "let $i$ be the number satisfying $i^2 + 1 = 0$."
Nevertheless, admission of the "imaginary" number $i$ to the family of numbers
seemed to simplify greatly many algebraic computations, especially when
"complex numbers" $a + bi$ (for $a$ and $b$ in R) were allowed, and all the
laws of arithmetical computation enumerated in Chapter 1 were assumed to be
valid. For example, every quadratic equation

\[ ax^2 + bx + c = 0 \qquad (a \neq 0) \]


can be solved formally to give

\[ x = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \qquad\text{or}\qquad x = \frac{-b - \sqrt{b^2 - 4ac}}{2a}. \]


If $b^2 - 4ac \ge 0$, these formulas give correct solutions; when complex
numbers are allowed the formulas seem to make sense in all cases. For example,
the equation

\[ x^2 + x + 1 = 0 \]

has no real root, since

\[ x^2 + x + 1 = \left(x + \tfrac{1}{2}\right)^2 + \tfrac{3}{4} > 0, \qquad\text{for all } x. \]
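But the formula for the roots of a quadratic equation suggests the "solutions"

\[ x = \frac{-1 + \sqrt{-3}}{2} \qquad\text{and}\qquad x = \frac{-1 - \sqrt{-3}}{2}; \]

if we understand $\sqrt{-3}$ to mean
$\sqrt{3 \cdot (-1)} = \sqrt{3} \cdot \sqrt{-1} = \sqrt{3}\, i$, then these
numbers would be

\[ -\frac{1}{2} + \frac{\sqrt{3}}{2}\, i \qquad\text{and}\qquad -\frac{1}{2} - \frac{\sqrt{3}}{2}\, i. \]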


It is not hard to check that these, as yet purely formal, numbers do indeed
satisfy the equation

\[ x^2 + x + 1 = 0. \]
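The check mentioned above can also be done numerically. The following is only a sketch that uses Python's built-in `complex` type as a stand-in for the formal numbers $a + bi$; the names `roots`, `quadratic` and `residuals` are illustrative and do not come from the text.

```python
import math

# Formal roots of x^2 + x + 1 = 0 from the quadratic formula,
# reading sqrt(-3) as sqrt(3) * i.
a, b, c = 1.0, 1.0, 1.0
sqrt_disc = complex(0.0, math.sqrt(3.0))   # one square root of b^2 - 4ac = -3
roots = [(-b + sqrt_disc) / (2 * a), (-b - sqrt_disc) / (2 * a)]

def quadratic(x):
    """Evaluate a*x^2 + b*x + c at a (possibly complex) x."""
    return a * x * x + b * x + c

# Both residuals should vanish up to floating-point rounding.
residuals = [abs(quadratic(x)) for x in roots]
```

Floating-point arithmetic only approximates the exact identity, so the residuals are compared with a small tolerance rather than with 0.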


It is even possible to "solve" quadratic equations whose coefficients are
themselves complex numbers. For example, the equation

\[ x^2 + x + 1 + i = 0 \]

ought to have the solutions

\[ x = \frac{-1 \pm \sqrt{1 - 4(1 + i)}}{2} = \frac{-1 \pm \sqrt{-3 - 4i}}{2}, \]

where the symbol $\sqrt{-3 - 4i}$ means a complex number $\alpha + \beta i$
whose square is $-3 - 4i$. In order to have


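\[ (\alpha + \beta i)^2 = \alpha^2 - \beta^2 + 2\alpha\beta i = -3 - 4i \]

we need

\[ \alpha^2 - \beta^2 = -3, \qquad 2\alpha\beta = -4. \]

These two equations can easily be solved for real $\alpha$ and $\beta$; in
fact, there are two possible solutions:

\[ \alpha = 1, \ \beta = -2 \qquad\text{and}\qquad \alpha = -1, \ \beta = 2. \]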


Thus the two "square roots" of $-3 - 4i$ are $1 - 2i$ and $-1 + 2i$. There is
no reasonable way to decide which one of these should be called
$\sqrt{-3 - 4i}$, and which $-\sqrt{-3 - 4i}$; the conventional usage of
$\sqrt{x}$ makes sense only for real $x \ge 0$, in which case $\sqrt{x}$
denotes the (real) nonnegative root. For this reason, the solution

\[ x = \frac{-1 \pm \sqrt{-3 - 4i}}{2} \]

must be understood as an abbreviation for:

\[ x = \frac{-1 + r}{2}, \qquad\text{where $r$ is one of the square roots of } -3 - 4i. \]


With this understanding we arrive at the solutions

\[ x = \frac{-1 + 1 - 2i}{2} = -i, \qquad x = \frac{-1 - 1 + 2i}{2} = -1 + i; \]

as you can easily check, these numbers do provide formal solutions for the
equation

\[ x^2 + x + 1 + i = 0. \]
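The computation above can be sketched in code. The helper `square_roots` below is hypothetical (it is not from the text); it solves $\alpha^2 - \beta^2 = u$, $2\alpha\beta = v$ using the extra observation, assumed here, that $\alpha^2 + \beta^2 = \sqrt{u^2 + v^2}$, and it assumes $v \neq 0$.

```python
import math

def square_roots(u, v):
    """Return the two complex square roots of u + v*i (v != 0 assumed),
    found from alpha^2 - beta^2 = u and 2*alpha*beta = v."""
    modulus = math.hypot(u, v)               # alpha^2 + beta^2
    alpha = math.sqrt((modulus + u) / 2.0)   # alpha^2 = (modulus + u) / 2
    beta = v / (2.0 * alpha)                 # from 2*alpha*beta = v
    return complex(alpha, beta), complex(-alpha, -beta)

r1, r2 = square_roots(-3.0, -4.0)             # the square roots of -3 - 4i
solutions = [(-1 + r) / 2 for r in (r1, r2)]  # formal roots of x^2 + x + 1 + i = 0
residuals = [abs(x * x + x + 1 + 1j) for x in solutions]
```

For $-3 - 4i$ this reproduces the two roots $1 - 2i$ and $-1 + 2i$ found by hand, and hence the solutions $-i$ and $-1 + i$.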




For cubic equations complex numbers are equally useful. Every cubic equation

\[ ax^3 + bx^2 + cx + d = 0 \qquad (a \neq 0) \]

with real coefficients $a$, $b$, $c$, and $d$, has, as we know, a real root
$\alpha$, and if we divide $ax^3 + bx^2 + cx + d$ by $x - \alpha$ we obtain a
second-degree polynomial whose roots are the other roots of
$ax^3 + bx^2 + cx + d = 0$; the roots of this second-degree polynomial may be
complex numbers. Thus a cubic equation will have either three real roots or
one real root and 2 complex roots. The existence of the real root is
guaranteed by our theorem that every odd degree equation has a real root, but
it is not really necessary to appeal to this theorem (which is of no use at
all if the coefficients are complex); in the case of a cubic equation we can,
with sufficient cleverness, actually find a formula for all the roots. The
following derivation is presented not only as an interesting illustration of
the ingenuity of early mathematicians, but as further evidence for the
importance of complex numbers (whatever they may be).

To solve the most general cubic equation, it obviously suffices to consider
only equations of the form

\[ x^3 + bx^2 + cx + d = 0. \]


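It is even possible to eliminate the term involving $x^2$, by a fairly
straightforward manipulation. If we let

\[ x = y - \frac{b}{3}, \]

then

\[ x^3 = y^3 - by^2 + \frac{b^2 y}{3} - \frac{b^3}{27}, \]
\[ x^2 = y^2 - \frac{2by}{3} + \frac{b^2}{9}, \]

so

\[ 0 = x^3 + bx^2 + cx + d \]
\[ = \left(y^3 - by^2 + \frac{b^2 y}{3} - \frac{b^3}{27}\right) + b\left(y^2 - \frac{2by}{3} + \frac{b^2}{9}\right) + c\left(y - \frac{b}{3}\right) + d \]
\[ = y^3 + \left(\frac{b^2}{3} - \frac{2b^2}{3} + c\right) y + \left(-\frac{b^3}{27} + \frac{b^3}{9} - \frac{bc}{3} + d\right). \]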


The right-hand side now contains no term with $y^2$. If we can solve the
equation for $y$ we can find $x$; this shows that it suffices to consider in
the first place only equations of the form

\[ x^3 + px + q = 0. \]

In the special case $p = 0$ we obtain the equation $x^3 = -q$. We shall see
later on that every complex number does have a cube root, in fact every
nonzero complex number has three, so that this equation has three solutions
when $q \neq 0$. The case $p \neq 0$, on the other hand, requires quite an
ingenious step. Let

\[ (*) \qquad x = w - \frac{p}{3w}. \]


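Then

\[ x^3 + px + q = \left(w - \frac{p}{3w}\right)^3 + p\left(w - \frac{p}{3w}\right) + q \]
\[ = w^3 - \frac{3w^2 p}{3w} + \frac{3w p^2}{9w^2} - \frac{p^3}{27w^3} + pw - \frac{p^2}{3w} + q \]
\[ = w^3 - \frac{p^3}{27w^3} + q. \]

The equation $x^3 + px + q = 0$ can therefore be written

\[ 27(w^3)^2 + 27q(w^3) - p^3 = 0, \]

which is a quadratic equation in $w^3$ (!!).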


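Thus

\[ w^3 = \frac{-27q \pm \sqrt{(27)^2 q^2 + 4 \cdot 27 p^3}}{2 \cdot 27} = -\frac{q}{2} \pm \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}. \]

Remember that this really means:

\[ w^3 = -\frac{q}{2} + r, \qquad\text{where $r$ is a square root of } \frac{q^2}{4} + \frac{p^3}{27}. \]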


We can therefore write

\[ w = \sqrt[3]{-\frac{q}{2} \pm \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}; \]

this equation means that $w$ is some cube root of $-q/2 + r$, where $r$ is
some square root of $q^2/4 + p^3/27$. This allows six possibilities for $w$,
but when these are substituted into (*), yielding

\[ x = \sqrt[3]{-\frac{q}{2} \pm \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} - \frac{p}{3 \cdot \sqrt[3]{-\dfrac{q}{2} \pm \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}}}, \]

it turns out that only 3 different values for $x$ will be obtained! An even
more surprising feature of this solution arises when we consider a cubic
equation all of whose roots are real; the formula derived above may still
involve complex numbers in an essential way. For example, the roots of

\[ x^3 - 15x - 4 = 0 \]

are $4$, $-2 + \sqrt{3}$, and $-2 - \sqrt{3}$. On the other hand, the formula derived


above (with $p = -15$, $q = -4$) gives as one solution

\[ x = \sqrt[3]{2 + \sqrt{4 - 125}} - \frac{-15}{3 \cdot \sqrt[3]{2 + \sqrt{4 - 125}}} = \sqrt[3]{2 + 11i} + \frac{15}{3 \cdot \sqrt[3]{2 + 11i}}. \]

Now,

\[ (2 + i)^3 = 2^3 + 3 \cdot 2^2 \cdot i + 3 \cdot 2 \cdot i^2 + i^3 = 8 + 12i - 6 - i = 2 + 11i, \]

so one of the cube roots of $2 + 11i$ is $2 + i$. Thus, for one solution of
the equation we obtain

\[ x = 2 + i + \frac{15}{6 + 3i} = 2 + i + \frac{15}{6 + 3i} \cdot \frac{6 - 3i}{6 - 3i} = 2 + i + \frac{90 - 45i}{36 + 9} = 2 + i + 2 - i = 4 \ (!). \]


The other roots can also be found if the other cube roots of $2 + 11i$ are
known. The fact that even one of these real roots is obtained from an
expression which depends on complex numbers is impressive enough to suggest
that the use of complex numbers cannot be entirely nonsense. As a matter of
fact, the formulas for the solutions of the quadratic and cubic equations can
be interpreted entirely in terms of real numbers.
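The claim that the six possible values of $w$ yield only three values of $x$ can be explored numerically. The sketch below is an illustration, not the text's own procedure: the function name `cardano_roots` and the tolerance `tol` are assumptions, and it relies on Python's `cmath` module and principal-value complex powers to pick one cube root before rotating it by the cube roots of 1.

```python
import cmath

def cardano_roots(p, q, tol=1e-9):
    """Distinct solutions of x^3 + p*x + q = 0 (p != 0 assumed),
    obtained from x = w - p/(3w) for all six candidate values of w."""
    found = []
    r = cmath.sqrt(q * q / 4 + p ** 3 / 27)   # one square root; -r is the other
    for root in (r, -r):
        w0 = (-q / 2 + root) ** (1 / 3)       # one cube root (principal value)
        for k in range(3):
            w = w0 * cmath.exp(2j * cmath.pi * k / 3)   # all three cube roots
            x = w - p / (3 * w)
            if all(abs(x - y) > tol for y in found):    # keep only new values
                found.append(x)
    return found

roots = cardano_roots(-15, -4)   # the text's example x^3 - 15x - 4 = 0
```

For this example the six candidates collapse to three values, which agree (up to rounding) with $4$, $-2 + \sqrt{3}$ and $-2 - \sqrt{3}$, even though every intermediate quantity is complex.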

Suppose we agree, for the moment, to write all complex numbers as $a + bi$,
writing the real number $a$ as $a + 0i$ and the number $i$ as $0 + 1i$. The
laws of ordinary arithmetic and the relation $i^2 = -1$ show that

\[ (a + bi) + (c + di) = (a + c) + (b + d)i, \]
\[ (a + bi) \cdot (c + di) = (ac - bd) + (ad + bc)i. \]

Thus, an equation like

\[ (1 + 2i) \cdot (3 + 1i) = 1 + 7i \]

may be regarded as simply an abbreviation for the two equations

\[ 1 \cdot 3 - 2 \cdot 1 = 1, \]
\[ 1 \cdot 1 + 2 \cdot 3 = 7. \]
The solution of the quadratic equation $ax^2 + bx + c = 0$ with real
coefficients could be paraphrased as follows:

If

\[ u^2 - v^2 = b^2 - 4ac, \qquad uv = 0 \]

(i.e., if $(u + vi)^2 = b^2 - 4ac$), then

\[ a\left[\left(\frac{-b + u}{2a}\right)^2 - \left(\frac{v}{2a}\right)^2\right] + b\left[\frac{-b + u}{2a}\right] + c = 0, \]
\[ a\left[2\left(\frac{-b + u}{2a}\right)\left(\frac{v}{2a}\right)\right] + b\left[\frac{v}{2a}\right] = 0 \]

(i.e., then $a\left(\dfrac{-b + u + vi}{2a}\right)^2 + b\left(\dfrac{-b + u + vi}{2a}\right) + c = 0$).

It is not very hard to check this assertion about real numbers without
writing down a single "$i$," but the complications of the statement itself
should convince you that equations about complex numbers are worthwhile as
abbreviations for pairs of equations about real numbers. (If you are still
not convinced, try paraphrasing the solution of the cubic equation.) If we
really intend to use complex numbers consistently, however, it is going to be
necessary to present some reasonable definition.

One possibility has been implicit in this whole discussion. All mathematical
properties of a complex number $a + bi$ are determined completely by the real
numbers $a$ and $b$; any mathematical object with this same property may
reasonably be used to define a complex number. The obvious candidate is the
ordered pair $(a, b)$ of real numbers; we shall accordingly define a complex
number to be a pair of real numbers, and likewise define what addition and
multiplication of complex numbers is to mean.

DEFINITION


A complex number is an ordered pair of real numbers; if $z = (a, b)$ is a
complex number, then $a$ is called the real part of $z$, and $b$ is called the
imaginary part of $z$. The set of all complex numbers is denoted by C. If
$(a, b)$ and $(c, d)$ are two complex numbers we define

\[ (a, b) + (c, d) = (a + c, \ b + d), \]
\[ (a, b) \cdot (c, d) = (a \cdot c - b \cdot d, \ a \cdot d + b \cdot c). \]

(The $+$ and $\cdot$ appearing on the left side are new symbols being defined,
while the $+$ and $\cdot$ appearing on the right side are the familiar
addition and multiplication for real numbers.)


When complex numbers were first introduced, it was understood that real
numbers were, in particular, complex numbers; if our definition is taken
seriously this is not true, since a real number is not a pair of real numbers,
after all. This difficulty is only a minor annoyance, however. Notice that

\[ (a, 0) + (b, 0) = (a + b, \ 0 + 0) = (a + b, 0), \]
\[ (a, 0) \cdot (b, 0) = (a \cdot b - 0 \cdot 0, \ a \cdot 0 + 0 \cdot b) = (a \cdot b, 0); \]

this shows that the complex numbers of the form $(a, 0)$ behave precisely the
same with respect to addition and multiplication of complex numbers as real
numbers do with their own addition and multiplication. For this reason we
will adopt the convention that $(a, 0)$ will be denoted simply by $a$. The
familiar $a + bi$ notation for complex numbers can now be recovered if one
more definition is made.

DEFINITION


\[ i = (0, 1). \]

Notice that $i^2 = (0, 1) \cdot (0, 1) = (-1, 0) = -1$ (the last equality sign
depends on our convention). Moreover

\[ (a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0) \cdot (0, 1) = a + bi. \]
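The ordered-pair definition translates almost word for word into code. The class name `Pair` below is a hypothetical illustration (Python already has a built-in `complex` type); it only sketches the two defining operations and checks the identities just derived.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pair:
    """A complex number modeled as an ordered pair (a, b) of real numbers."""
    a: float
    b: float

    def __add__(self, other):
        # (a, b) + (c, d) = (a + c, b + d)
        return Pair(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a, b) . (c, d) = (a*c - b*d, a*d + b*c)
        return Pair(self.a * other.a - self.b * other.b,
                    self.a * other.b + self.b * other.a)

i = Pair(0, 1)
i_squared = i * i                       # (0*0 - 1*1, 0*1 + 1*0) = (-1, 0)
product = Pair(1, 2) * Pair(3, 1)       # the earlier example (1 + 2i)(3 + 1i)
rebuilt = Pair(5, 0) + Pair(7, 0) * i   # (a, 0) + (b, 0) . i with a = 5, b = 7
```

Here `i_squared` equals `Pair(-1, 0)`, `product` equals `Pair(1, 7)`, and `rebuilt` equals `Pair(5, 7)`, matching $i^2 = -1$, $(1 + 2i)(3 + 1i) = 1 + 7i$, and $(a, b) = a + bi$.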


You may feel that our definition was merely an elaborate device for defining
complex numbers as "expressions of the form $a + bi$." That is essentially
correct; it is a firmly established prejudice of modern mathematics that new
objects must be defined as something specific, not as "expressions."
Nevertheless, it is interesting to note that mathematicians were sincerely
worried about using complex numbers until the modern definition was proposed.
Moreover, the precise definition emphasizes one important point. Our aim in
introducing complex numbers was to avoid the necessity of paraphrasing
statements about complex numbers in terms of their real and imaginary parts.
This means that we wish to work with complex numbers in the same way that we
worked with rational or real numbers. For example, the solution of the cubic
equation required writing $x = w - p/3w$, so we want to know that $1/w$ makes
sense. Moreover, $w^3$ was found by solving a quadratic equation, which
requires numerous other algebraic manipulations. In short, we are likely to
use, at some time or other, any manipulations performed on real numbers. We
certainly do not want to stop each time and justify every step. Fortunately
this is not necessary. Since all algebraic manipulations performed on real
numbers can be justified by the properties listed in Chapter 1, it is only
necessary to check that these properties are also true for complex numbers.
In most cases this is quite easy, and these facts will not be listed as formal
theorems. For example, the proof of P1,


\[ [(a, b) + (c, d)] + (e, f) = (a, b) + [(c, d) + (e, f)], \]

requires only the application of the definition of addition for complex
numbers. The left side becomes

\[ ([a + c] + e, \ [b + d] + f), \]

and the right side becomes

\[ (a + [c + e], \ b + [d + f]); \]

these two are equal because P1 is true for real numbers. It is a good idea to
check P2-P7 and P9. Notice that the complex numbers playing the role of 0 and
1 in P2 and P6 are $(0, 0)$ and $(1, 0)$, respectively. It is not hard to
figure out what $-(a, b)$ is, but the multiplicative inverse for $(a, b)$
required in P8 is a little trickier: if $(a, b) \neq (0, 0)$, then
$a^2 + b^2 \neq 0$ and

\[ (a, b) \cdot \left(\frac{a}{a^2 + b^2}, \ \frac{-b}{a^2 + b^2}\right) = (1, 0). \]


This fact could have been guessed in two ways. To find $(x, y)$ with

\[ (a, b) \cdot (x, y) = (1, 0) \]

it is only necessary to solve the equations

\[ ax - by = 1, \]
\[ bx + ay = 0. \]

The solutions are $x = a/(a^2 + b^2)$, $y = -b/(a^2 + b^2)$. It is also
possible to reason that if $1/(a + bi)$ means anything, then it should be
true that

\[ \frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 + b^2}. \]
Once the existence of inverses has actually been proved (after guessing the
inverse by some method), it follows that this manipulation is really valid; it
is the easiest one to remember when the inverse of a complex number is
actually being sought. It was precisely this trick which we used to evaluate

\[ \frac{15}{6 + 3i} = \frac{15}{6 + 3i} \cdot \frac{6 - 3i}{6 - 3i} = \frac{90 - 45i}{36 + 9}. \]
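The inverse formula can be checked directly on pairs of real numbers. In this sketch the helper names `inverse` and `multiply` are illustrative assumptions; pairs are represented as plain Python tuples.

```python
def inverse(a, b):
    """Multiplicative inverse of the pair (a, b) != (0, 0), as a pair."""
    n = a * a + b * b            # nonzero whenever (a, b) != (0, 0)
    return (a / n, -b / n)

def multiply(z, w):
    """Product of two pairs, following the definition of complex multiplication."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

inv = inverse(6, 3)                   # 1 / (6 + 3i)
check = multiply((6, 3), inv)         # should be (1, 0) up to rounding
quotient = multiply((15, 0), inv)     # 15 / (6 + 3i)
```

Up to floating-point rounding, `check` is $(1, 0)$ and `quotient` is $(2, -1)$, that is, $15/(6 + 3i) = 2 - i$, the value used in the cubic example.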


Unlike P1-P9, the rules P10-P12 do not have analogues: it is easy to prove
that there is no set $P$ of complex numbers such that P10-P12 are satisfied
for all complex numbers. In fact, if there were, then $P$ would have to
contain $1$ (since $1 = 1^2$) and also $-1$ (since $-1 = i^2$), and this would
contradict P10. The absence of P10-P12 will not have disastrous consequences,
but it does mean that we cannot define $z < w$ for complex $z$ and $w$. Also,
you may remember that for the real numbers, P10-P12 were used to prove that
$1 + 1 \neq 0$. Fortunately, the corresponding fact for complex numbers can be
reduced to this one: clearly $(1, 0) + (1, 0) \neq (0, 0)$.

Although we will usually write complex numbers in the form $a + bi$, it is
worth remembering that the set of all complex numbers C is just the collection
of all pairs of real numbers. Long ago this collection was identified with the
plane, and for this reason the plane is often called the "complex plane." The
horizontal axis, which consists of all points $(a, 0)$ for $a$ in R, is often
called the real axis, and the vertical axis is called the imaginary axis. Two
important definitions are also related to this geometric picture.

DEFINITION

If $z = x + iy$ is a complex number (with $x$ and $y$ real), then the
conjugate $\bar{z}$ of $z$ is defined as

\[ \bar{z} = x - iy, \]

and the absolute value or modulus $|z|$ of $z$ is defined as

\[ |z| = \sqrt{x^2 + y^2}. \]

(Notice that $x^2 + y^2 \ge 0$, so that $\sqrt{x^2 + y^2}$ is defined
unambiguously; it denotes the nonnegative real square root of $x^2 + y^2$.)

Geometrically, $\bar{z}$ is simply the reflection of $z$ in the real axis,
while $|z|$ is the distance from $z$ to $(0, 0)$ (Figure 1). Notice that the
absolute value notation for complex numbers is consistent with that for real
numbers. The distance between two complex numbers $z$ and $w$ can be defined
quite easily as $|z - w|$. The following theorem lists all the important
properties of conjugates and absolute values.

FIGURE 1

THEOREM 1

Let $z$ and $w$ be complex numbers. Then
(1) $\bar{\bar{z}} = z$.
(2) $\bar{z} = z$ if and only if $z$ is real (i.e., is of the form $a + 0i$,
    for some real number $a$).
(3) $\overline{z + w} = \bar{z} + \bar{w}$.
(4) $\overline{-z} = -\bar{z}$.
(5) $\overline{z \cdot w} = \bar{z} \cdot \bar{w}$.
(6) $\overline{z^{-1}} = (\bar{z})^{-1}$, if $z \neq 0$.
(7) $|z|^2 = z \cdot \bar{z}$.
(8) $|z \cdot w| = |z| \cdot |w|$.
(9) $|z + w| \le |z| + |w|$.

PROOF


Assertions (1) and (2) are obvious. Equations (3) and (5) may be checked by
straightforward calculations and (4) and (6) may then be proved by a trick:

\[ 0 = \bar{0} = \overline{z + (-z)} = \bar{z} + \overline{-z}, \qquad\text{so } \overline{-z} = -\bar{z}, \]
\[ 1 = \bar{1} = \overline{z \cdot z^{-1}} = \bar{z} \cdot \overline{z^{-1}}, \qquad\text{so } \overline{z^{-1}} = (\bar{z})^{-1}. \]

Equations (7) and (8) may also be proved by a straightforward calculation.
The only difficult part of the theorem is (9). This inequality has, in fact,
already occurred (Problem 4-8), but the proof will be repeated here, using
slightly different terminology.

It is clear that equality holds in (9) if $z = 0$ or $w = 0$. It is also easy
to see that (9) is true if $z = \lambda w$ for any real number $\lambda$
(consider separately the cases $\lambda > 0$ and $\lambda < 0$). Suppose, on
the other hand, that $z \neq \lambda w$ for any real number $\lambda$, and
that $w \neq 0$. Then, for all real numbers $\lambda$,

\[ (*) \qquad 0 < |z - \lambda w|^2 = (z - \lambda w) \cdot \overline{(z - \lambda w)} \]
\[ = (z - \lambda w) \cdot (\bar{z} - \lambda \bar{w}) \]
\[ = z\bar{z} + \lambda^2 w\bar{w} - \lambda(w\bar{z} + z\bar{w}) \]
\[ = \lambda^2 |w|^2 + |z|^2 - \lambda(w\bar{z} + z\bar{w}). \]

Notice that $w\bar{z} + z\bar{w}$ is real, since

\[ \overline{w\bar{z} + z\bar{w}} = \bar{w}\bar{\bar{z}} + \bar{z}\bar{\bar{w}} = \bar{w}z + \bar{z}w = w\bar{z} + z\bar{w}. \]

Thus the right side of (*) is a quadratic polynomial in $\lambda$ with real
coefficients and no real roots; its discriminant must therefore be negative.
Thus

\[ (w\bar{z} + z\bar{w})^2 - 4|w|^2 \cdot |z|^2 < 0; \]

it follows, since $w\bar{z} + z\bar{w}$ and $|w| \cdot |z|$ are real numbers,
and $|w| \cdot |z| \ge 0$, that

\[ (w\bar{z} + z\bar{w}) < 2|w| \cdot |z|. \]

From this inequality it follows that

\[ |z + w|^2 = (z + w) \cdot (\bar{z} + \bar{w}) \]
\[ = |z|^2 + |w|^2 + (w\bar{z} + z\bar{w}) \]
\[ < |z|^2 + |w|^2 + 2|w| \cdot |z| \]
\[ = (|z| + |w|)^2, \]

which implies that

\[ |z + w| < |z| + |w|. \]

FIGURE 2

The operations of addition and multiplication of complex numbers both have
important geometric interpretations. The picture for addition is very simple
(Figure 2). Two complex numbers $z = (a, b)$ and $w = (c, d)$ determine a
parallelogram having for two of its sides the line segment from $(0, 0)$ to
$z$, and the line segment from $(0, 0)$ to $w$; the vertex opposite $(0, 0)$
is $z + w$ (a proof of this geometric fact is left to you).

The interpretation of multiplication is more involved. If $z = 0$ or $w = 0$,
then $z \cdot w = 0$ (a one-line computational proof can be given, but even
this is unnecessary, since the assertion has already been shown to follow from
P1-P9), so we may restrict our attention to nonzero complex numbers. We begin
by putting every nonzero complex number into a special form.

For any complex number $z \neq 0$ we can write

\[ z = |z| \cdot \frac{z}{|z|}; \]

in this expression, $|z|$ is a positive real number, while

\[ \left|\frac{z}{|z|}\right| = \frac{|z|}{|z|} = 1, \]

so that $z/|z|$ is a complex number of absolute value 1. Now any complex
number $\alpha = x + iy$ with $1 = |\alpha| = x^2 + y^2$ can be written in
the form

\[ \alpha = (\cos\theta, \sin\theta) = \cos\theta + i \sin\theta \]

for some number $\theta$. Thus every nonzero complex number $z$ can be written

\[ z = r(\cos\theta + i \sin\theta) \]

for some $r > 0$ and some number $\theta$. The number $r$ is unique (it equals
$|z|$), but $\theta$ is not unique; if $\theta_0$ is one possibility, then the
others are $\theta_0 + 2k\pi$ for $k$ in Z; any one of these numbers is called
an argument of $z$. Figure 3 shows $z$ in terms of $r$ and $\theta$. (To find
an argument $\theta$ for $z = x + iy$ we may note that the equation

\[ x + iy = z = |z|(\cos\theta + i \sin\theta) \]

means that

\[ x = |z| \cos\theta, \]
\[ y = |z| \sin\theta. \]

If $x > 0$ we can therefore take $\theta = \arctan y/x$; if $x < 0$ we can
take $\theta = \pi + \arctan y/x$; and if $x = 0$, we can take
$\theta = \pi/2$ when $y > 0$ and $\theta = 3\pi/2$ when $y < 0$.)

FIGURE 3 (angle of $\theta$ radians)

Now the product of two nonzero complex numbers

\[ z = r(\cos\theta + i \sin\theta), \]
\[ w = s(\cos\phi + i \sin\phi), \]

is

\[ z \cdot w = rs(\cos\theta + i \sin\theta)(\cos\phi + i \sin\phi) \]
\[ = rs[(\cos\theta \cos\phi - \sin\theta \sin\phi) + i(\sin\theta \cos\phi + \cos\theta \sin\phi)] \]
\[ = rs[\cos(\theta + \phi) + i \sin(\theta + \phi)]. \]

Thus, the absolute value of a product is the product of the absolute values of
the factors, while the sum of any argument for each of the factors will be an
argument for the product. For a nonzero complex number

\[ z = r(\cos\theta + i \sin\theta) \]

it is now an easy matter to prove by induction the following very important
formula (sometimes known as De Moivre's Theorem):

\[ z^n = |z|^n(\cos n\theta + i \sin n\theta), \qquad\text{for any argument $\theta$ of $z$.} \]

This formula describes $z^n$ so explicitly that it is easy to decide just when
$z^n = w$:

THEOREM 2

Every nonzero complex number has exactly $n$ complex $n$th roots. More
precisely, for any complex number $w \neq 0$, and any natural number $n$,
there are precisely $n$ different complex numbers $z$ satisfying $z^n = w$.

PROOF


Let

\[ w = s(\cos\phi + i \sin\phi) \]

for $s = |w|$ and some number $\phi$. Then a complex number

\[ z = r(\cos\theta + i \sin\theta) \]

satisfies $z^n = w$ if and only if

\[ r^n(\cos n\theta + i \sin n\theta) = s(\cos\phi + i \sin\phi), \]

which happens if and only if

\[ r^n = s, \]
\[ \cos n\theta + i \sin n\theta = \cos\phi + i \sin\phi. \]

From the first equation it follows that

\[ r = \sqrt[n]{s}, \]

where $\sqrt[n]{s}$ denotes the positive real $n$th root of $s$. From the
second equation it follows that for some integer $k$ we have

\[ \theta = \theta_k = \frac{\phi}{n} + \frac{2k\pi}{n}. \]

Conversely, if we choose $r = \sqrt[n]{s}$ and $\theta = \theta_k$ for some
$k$, then the number $z = r(\cos\theta + i \sin\theta)$ will satisfy
$z^n = w$. To determine the number of $n$th roots of $w$, it is therefore only
necessary to determine which such $z$ are distinct. Now any integer $k$ can be
written

\[ k = nq + k' \]

for some integer $q$, and some integer $k'$ between $0$ and $n - 1$. Then

\[ \cos\theta_k + i \sin\theta_k = \cos\theta_{k'} + i \sin\theta_{k'}. \]

This shows that every $z$ satisfying $z^n = w$ can be written

\[ z = \sqrt[n]{s}\,(\cos\theta_k + i \sin\theta_k), \qquad k = 0, \ldots, n - 1. \]

Moreover, it is easy to see that these numbers are all different, since any
two $\theta_k$ for $k = 0, \ldots, n - 1$ differ by less than $2\pi$. ∎

FIGURE 4
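The proof is constructive, and the construction can be sketched in a few lines. The function name `nth_roots` is an assumption for this illustration; `cmath.phase` supplies one argument $\phi$ of $w$ (any other argument would give the same set of roots).

```python
import cmath
import math

def nth_roots(w, n):
    """All n complex nth roots of a nonzero complex number w."""
    s = abs(w)
    phi = cmath.phase(w)        # one argument of w
    r = s ** (1.0 / n)          # positive real nth root of s
    # theta_k = phi/n + 2*k*pi/n for k = 0, ..., n - 1
    return [r * complex(math.cos((phi + 2 * math.pi * k) / n),
                        math.sin((phi + 2 * math.pi * k) / n))
            for k in range(n)]

cube_roots_of_i = nth_roots(1j, 3)
```

Each element of `cube_roots_of_i` cubes to $i$ up to rounding, and the three values are distinct, as the theorem asserts.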


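In the course of proving Theorem 2, we have actually developed a method for
finding the $n$th roots of a complex number. For example, to find the cube
roots of $i$ (Figure 4) note that $|i| = 1$ and that $\pi/2$ is an argument
for $i$. The cube roots of $i$ are therefore

\[ 1 \cdot \left[\cos\frac{\pi}{6} + i \sin\frac{\pi}{6}\right], \]
\[ 1 \cdot \left[\cos\left(\frac{\pi}{6} + \frac{2\pi}{3}\right) + i \sin\left(\frac{\pi}{6} + \frac{2\pi}{3}\right)\right] = 1 \cdot \left[\cos\frac{5\pi}{6} + i \sin\frac{5\pi}{6}\right], \]
\[ 1 \cdot \left[\cos\left(\frac{\pi}{6} + \frac{4\pi}{3}\right) + i \sin\left(\frac{\pi}{6} + \frac{4\pi}{3}\right)\right] = 1 \cdot \left[\cos\frac{3\pi}{2} + i \sin\frac{3\pi}{2}\right]. \]

Since

\[ \cos\frac{\pi}{6} = \frac{\sqrt{3}}{2}, \qquad \sin\frac{\pi}{6} = \frac{1}{2}, \]
\[ \cos\frac{5\pi}{6} = -\frac{\sqrt{3}}{2}, \qquad \sin\frac{5\pi}{6} = \frac{1}{2}, \]
\[ \cos\frac{3\pi}{2} = 0, \qquad \sin\frac{3\pi}{2} = -1, \]

the cube roots of $i$ are

\[ \frac{\sqrt{3} + i}{2}, \qquad \frac{-\sqrt{3} + i}{2}, \qquad -i. \]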

In general, we cannot expect to obtain such simple results. For example, to
find the cube roots of $2 + 11i$, note that
$|2 + 11i| = \sqrt{2^2 + 11^2} = \sqrt{125}$ and that $\arctan\frac{11}{2}$
is an argument for $2 + 11i$. One of the cube roots of $2 + 11i$ is therefore

\[ \sqrt[6]{125}\left[\cos\left(\frac{\arctan\frac{11}{2}}{3}\right) + i \sin\left(\frac{\arctan\frac{11}{2}}{3}\right)\right] = \sqrt{5}\left[\cos\left(\frac{\arctan\frac{11}{2}}{3}\right) + i \sin\left(\frac{\arctan\frac{11}{2}}{3}\right)\right]. \]

Previously we noted that $2 + i$ is also a cube root of $2 + 11i$. Since
$|2 + i| = \sqrt{2^2 + 1^2} = \sqrt{5}$, and since $\arctan\frac{1}{2}$ is an
argument of $2 + i$, we can write this cube root as

\[ 2 + i = \sqrt{5}\left(\cos\arctan\tfrac{1}{2} + i \sin\arctan\tfrac{1}{2}\right). \]

These two cube roots are actually the same number, because

\[ \frac{\arctan\frac{11}{2}}{3} = \arctan\frac{1}{2} \]

(you can check this by using the formula in Problem 15-8), but this is hardly
the sort of thing one might notice!
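The identity between the two arctangents is easy to confirm numerically, which at least guards against an arithmetic slip. This is a floating-point sketch with illustrative variable names, not a proof.

```python
import math

lhs = math.atan(11 / 2) / 3
rhs = math.atan(1 / 2)
same_angle = math.isclose(lhs, rhs, rel_tol=1e-12)

# The corresponding cube root of 2 + 11i, rebuilt from its polar form.
r = math.sqrt(125) ** (1 / 3)           # equals sqrt(5) up to rounding
root = complex(r * math.cos(lhs), r * math.sin(lhs))
```

Here `root` agrees with $2 + i$ up to rounding, so the polar-form cube root and the one found by inspection are the same number.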

The fact that every complex number has an $n$th root for all $n$ is just a
special case of a very important theorem. The number $i$ was originally
introduced in order to provide a solution for the equation $x^2 + 1 = 0$. The
Fundamental Theorem of Algebra states the remarkable fact that this one
addition automatically provides solutions for all other polynomial equations:
every equation

\[ z^n + a_{n-1} z^{n-1} + \cdots + a_0 = 0 \qquad (a_0, \ldots, a_{n-1} \text{ in C}) \]

has a complex root!

In the next chapter we shall give an almost complete proof of the Fundamental
Theorem of Algebra; the slight gap left in the text can be filled in as an
exercise (Problem 25-5). The proof of the theorem will rely on several new
concepts which come up quite naturally in a more thorough investigation of
complex numbers.


PROBLEMS

1. Find the absolute value and argument of each of the following.

(i) $3 + 4i$.
(ii) $(3 + 4i)^{-1}$.
(iii) $(1 + i)^5$.
(iv) $\sqrt[7]{3 + 4i}$.
(v) $|3 + 4i|$.

2. Solve the following equations.

(i) $x^2 + ix + 1 = 0$.
(ii) $x^4 + x^2 + 1 = 0$.
(iii) $x^2 + 2ix - 1 = 0$.
(iv) $ix - (1 + i)y = 3$,
     $(2 + i)x + iy = 4$.
(v) $x^3 - x^2 - x - 2 = 0$.

3. Describe the set of all complex numbers $z$ such that

(i) $\bar{z} = -z$.
(ii) $\bar{z} = z^{-1}$.
(iii) $|z - a| = |z - b|$.
(iv) $|z - a| + |z - b| = c$.
(v) $|z| < 1 - \text{real part of } z$.

4. Prove that $|z| = |\bar{z}|$, and that the real part of $z$ is
$(z + \bar{z})/2$, while the imaginary part is $(z - \bar{z})/2i$.

5. Prove that $|z + w|^2 + |z - w|^2 = 2(|z|^2 + |w|^2)$, and interpret this
statement geometrically.

6. What is the pictorial relation between $z$ and
$\sqrt{i} \cdot \overline{z \sqrt{-i}}$? Hint: Which line goes into the real
axis under multiplication by $\sqrt{-i}$?

7. (a) Prove that if $a_0, \ldots, a_{n-1}$ are real and $a + bi$ (for $a$ and
$b$ real) satisfies the equation
$z^n + a_{n-1} z^{n-1} + \cdots + a_0 = 0$, then $a - bi$ also satisfies this
equation. (Thus the nonreal roots of such an equation always occur in pairs,
and the number of such roots is even.)
(b) Conclude that $z^n + a_{n-1} z^{n-1} + \cdots + a_0$ is divisible by
$z^2 - 2az + (a^2 + b^2)$ (whose coefficients are real).

*8. (a) Let $c$ be an integer which is not the square of another integer. If
$a$ and $b$ are integers we define the conjugate of $a + b\sqrt{c}$, denoted
by $\overline{a + b\sqrt{c}}$, as $a - b\sqrt{c}$. Show that the conjugate is
well defined


by showing that a number can be written $a + b\sqrt{c}$, for integers $a$ and
$b$, in only one way.

(b) Show that for all $\alpha$ and $\beta$ of the form $a + b\sqrt{c}$, we
have $\bar{\bar{\alpha}} = \alpha$, $\bar{\alpha} = \alpha$ if and only if
$\alpha$ is an integer, $\overline{\alpha + \beta} = \bar{\alpha} + \bar{\beta}$,
$\overline{-\alpha} = -\bar{\alpha}$,
$\overline{\alpha \cdot \beta} = \bar{\alpha} \cdot \bar{\beta}$, and
$\overline{\alpha^{-1}} = (\bar{\alpha})^{-1}$ if $\alpha \neq 0$.

(c) Prove that if $a_0, \ldots, a_{n-1}$ are integers and $z = a + b\sqrt{c}$
satisfies the equation $z^n + a_{n-1} z^{n-1} + \cdots + a_0 = 0$, then
$\bar{z} = a - b\sqrt{c}$ also satisfies this equation.

9. Find all the 4th roots of $i$; express the one having smallest argument in
a form that does not involve any trigonometric functions.

*10. (a) Prove that if $\omega$ is an $n$th root of 1, then so is $\omega^k$.
(b) A number $\omega$ is called a primitive $n$th root of 1 if
$\{1, \omega, \omega^2, \ldots, \omega^{n-1}\}$ is the set of all $n$th roots
of 1. How many primitive $n$th roots of 1 are there for $n = 3, 4, 5, 9$?
(c) Let $\omega$ be an $n$th root of 1, with $\omega \neq 1$. Prove that
$\sum_{k=0}^{n-1} \omega^k = 0$.

*11. (a) Prove that if $z_1, \ldots, z_k$ lie on one side of some straight
line through 0, then $z_1 + \cdots + z_k \neq 0$. Hint: This is obvious from
the geometric interpretation of addition, but an analytic proof is also easy:
the assertion is clear if the line is the real axis, and a trick will reduce
the general case to this one.
(b) Show further that $z_1^{-1}, \ldots, z_k^{-1}$ all lie on one side of a
straight line through 0, so that $z_1^{-1} + \cdots + z_k^{-1} \neq 0$.

*12. Prove that if $|z_1| = |z_2| = |z_3|$ and $z_1 + z_2 + z_3 = 0$, then
$z_1$, $z_2$, and $z_3$ are the vertices of an equilateral triangle. Hint: It
will help to assume that $z_1$ is real, and this can be done with no loss of
generality. Why?


CHAPTER 


COMPLEX FUNCTIONS 


You will probably not be surprised to learn that a deeper investigation of 
complex numbers depends on the notion of functions. Until now a function 
was (intuitively) a rule which assigned real numbers to certain other real 
numbers. But there is no reason why this concept should not be extended; we 
might just as well consider a rule which assigns complex numbers to certain 
other complex numbers. A rigorous definition presents no problems (we will 
not even accord it the full honors of a formal definition): a function is a collec- 
tion of pairs of complex numbers which does not contain two distinct pairs 
with the same first element. Since we consider real numbers to be certain com- 
plex numbers, the old definition is really a special case of the new one. Never- 
theless, we will sometimes resort to special terminology in order to clarify 
the context in which a function is being considered. A function f is called
real-valued if f(z) is a real number for all z in the domain of f, and complex- 
valued to emphasize that it is not necessarily real-valued. Similarly, we will 
usually state explicitly that a function f is defined on [a subset of] R in those 
cases where the domain of f is [a subset of] R; in other cases we sometimes 
mention that f is defined on [a subset of] C to emphasize that f(z) is defined 
for complex z as well as real z. 

Among the multitude of functions defined on C, certain ones are particularly
important. Foremost among these are the functions of the form

\[ f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0, \]

where $a_0, \ldots, a_n$ are complex numbers. These functions are called, as
in the real case, polynomial functions; they include the function $f(z) = z$
(the "identity function") and functions of the form $f(z) = a$ for some
complex number $a$ ("constant functions"). Another important generalization of
a familiar function is the "absolute value function" $f(z) = |z|$ for all $z$
in C. Two functions of particular importance for complex numbers are Re (the
"real part function") and Im (the "imaginary part function"), defined by

\[ \mathrm{Re}(x + iy) = x, \]
\[ \mathrm{Im}(x + iy) = y. \]

The "conjugate function" is defined by

\[ f(z) = \bar{z} = \mathrm{Re}(z) - i\,\mathrm{Im}(z). \]


Familiar real-valued functions defined on R may be combined in many ways to
produce new complex-valued functions defined on C; an example is the function

\[ f(x + iy) = e^{y} \sin(x - y) + i x^3 \cos y, \qquad\text{for $x$ and $y$ real.} \]

The formula for this particular function illustrates a decomposition which is
always possible. Any complex-valued function $f$ can be written in the form

\[ f = u + iv \]

for some real-valued functions $u$ and $v$: simply define $u(z)$ as the real
part of $f(z)$, and $v(z)$ as the imaginary part. This decomposition is often
very useful, but not always; for example, it would be inconvenient to describe
a polynomial function in this way.
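As a small illustration of the decomposition $f = u + iv$, consider the function $f(z) = z^2$, chosen here only as an example. Writing $z = x + iy$ gives $u = x^2 - y^2$ and $v = 2xy$; the sketch below checks this at one point, and all names in it are illustrative.

```python
def f(z):
    return z * z

def u(x, y):
    return x * x - y * y      # real part of (x + iy)^2

def v(x, y):
    return 2 * x * y          # imaginary part of (x + iy)^2

z = complex(3.0, -2.0)
decomposed = complex(u(z.real, z.imag), v(z.real, z.imag))
```

At $z = 3 - 2i$ both `f(z)` and `decomposed` equal $5 - 12i$. Even for this simple polynomial, the pair of real formulas is clumsier than the single complex one.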

One other function will play an important role in this chapter. Recall that an
argument of a nonzero complex number $z$ is a (real) number $\theta$ such that

\[ z = |z|(\cos\theta + i \sin\theta). \]

There are infinitely many arguments for $z$, but just one which satisfies
$0 \le \theta < 2\pi$. If we call this unique argument $\theta(z)$, then
$\theta$ is a (real-valued) function (the "argument function") on
$\{z \text{ in C} : z \neq 0\}$.
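One way to compute this argument function is sketched below. It uses `math.atan2`, which returns an angle between $-\pi$ and $\pi$, and shifts negative angles up by $2\pi$; the function name `theta` mirrors the text's notation, but the implementation is only an assumption about how one might realize it.

```python
import math

def theta(z):
    """The argument of nonzero z that lies in [0, 2*pi)."""
    if z == 0:
        raise ValueError("0 has no argument")
    angle = math.atan2(z.imag, z.real)   # between -pi and pi
    if angle < 0:
        angle += 2 * math.pi
    # Rounding can push a tiny negative angle up to exactly 2*pi; fold it to 0.
    return angle if angle < 2 * math.pi else 0.0
```

For instance, `theta(1j)` is $\pi/2$ and `theta(-1j)` is $3\pi/2$, up to rounding. The function jumps as $z$ crosses the positive real axis, so it is not continuous there.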

"Graphs" of complex-valued functions defined on C, since they lie in
4-dimensional space, are presumably not very useful for visualization. The
alternative picture of a function mentioned in Chapter 4 can be used instead:
we draw two copies of C, and arrows from $z$ in one copy, to $f(z)$ in the
other (Figure 1).

FIGURE 1


The most common pictorial representation of a complex-valued function is
produced by labeling a point in the plane with the value $f(z)$, instead of
with $z$ (which can be estimated from the position of the point in the
picture). Figure 2 shows this sort of picture for several different functions.
Certain features of the function are illustrated very clearly by such a
"graph." For example, the absolute value function is constant on concentric
circles around 0, the functions Re and Im are constant on the vertical and
horizontal lines, respectively, and the function $f(z) = z^2$ wraps the circle
of radius $r$ twice around the circle of radius $r^2$.

Despite the problems involved in visualizing complex-valued functions in 
general, it is still possible to define analogues of important properties pre- 
viously defined for real-valued functions on R, and in some cases these 
properties may be easier to visualize in the complex case. For example, the 
notion of limit can be defined as follows: 


$\lim_{z \to a} f(z) = l$ means that for every (real) number ε > 0 there is a (real)
number δ > 0 such that, for all z, if 0 < |z − a| < δ, then |f(z) − l| < ε.


FIGURE 2 (panels include (c) f(z) = Im(z) and (e) f(z) = θ(z))


Although the definition reads precisely as before, the interpretation is slightly
different. Since |z − w| is the distance between the complex numbers z and w,
the equation $\lim_{z \to a} f(z) = l$ means that the values of f(z) can be made to lie
inside any given circle around l, provided that z is restricted to lie inside a
sufficiently small circle around a. This assertion is particularly easy to visualize
using the “two copy” picture of a function (Figure 3).

FIGURE 3

Certain facts about limits can be proved exactly as in the real case. In
particular,

$$\lim_{z \to a} c = c,$$
$$\lim_{z \to a} z = a,$$
$$\lim_{z \to a} [f(z) + g(z)] = \lim_{z \to a} f(z) + \lim_{z \to a} g(z),$$
$$\lim_{z \to a} f(z) \cdot g(z) = \lim_{z \to a} f(z) \cdot \lim_{z \to a} g(z),$$
$$\lim_{z \to a} \frac{1}{g(z)} = \frac{1}{\lim_{z \to a} g(z)}, \quad \text{if } \lim_{z \to a} g(z) \neq 0.$$

The essential property of absolute values upon which these results are based
is the inequality |z + w| ≤ |z| + |w|, and this inequality holds for complex
numbers as well as for real numbers. These facts already provide quite a few
limits, but many more can be obtained from the following theorem.

THEOREM 1

Let f(z) = u(z) + iv(z) for real-valued functions u and v, and let l = α + iβ
for real numbers α and β. Then $\lim_{z \to a} f(z) = l$ if and only if

$$\lim_{z \to a} u(z) = \alpha \quad \text{and} \quad \lim_{z \to a} v(z) = \beta.$$

PROOF

Suppose first that $\lim_{z \to a} f(z) = l$. If ε > 0, there is δ > 0 such that, for all z,

if 0 < |z − a| < δ, then |f(z) − l| < ε.

The second inequality can be written

$$|[u(z) - \alpha] + i[v(z) - \beta]| < \varepsilon,$$




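or

$$[u(z) - \alpha]^2 + [v(z) - \beta]^2 < \varepsilon^2.$$

Since u(z) − α and v(z) − β are both real numbers, their squares are nonnegative;
this inequality therefore implies that

$$[u(z) - \alpha]^2 < \varepsilon^2 \quad \text{and} \quad [v(z) - \beta]^2 < \varepsilon^2,$$

which implies that

$$|u(z) - \alpha| < \varepsilon \quad \text{and} \quad |v(z) - \beta| < \varepsilon.$$

Since this is true for all ε > 0, it follows that

$$\lim_{z \to a} u(z) = \alpha \quad \text{and} \quad \lim_{z \to a} v(z) = \beta.$$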


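Now suppose that these two equations hold. If ε > 0, there is a δ > 0 such
that, for all z, if 0 < |z − a| < δ, then

$$|u(z) - \alpha| < \frac{\varepsilon}{2} \quad \text{and} \quad |v(z) - \beta| < \frac{\varepsilon}{2},$$

which implies that

$$\begin{aligned}
|f(z) - l| &= |[u(z) - \alpha] + i[v(z) - \beta]| \\
&\le |u(z) - \alpha| + |i| \cdot |v(z) - \beta| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}$$

This proves that $\lim_{z \to a} f(z) = l$. ∎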
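The two inequalities used in this proof can be observed numerically. The sketch below is only an illustration under stated assumptions: it uses the sample function f(x + iy) = e^y sin x + i x³ cos y, takes the point a = 1 + 2i (an arbitrary choice), and takes l = f(a) as the limit, which is justified by the continuity of this function. The helper name `errors` is hypothetical.

```python
import math

def f(z: complex) -> complex:
    """Sample function f(x + iy) = e^y sin x + i x^3 cos y."""
    x, y = z.real, z.imag
    return complex(math.exp(y) * math.sin(x), x**3 * math.cos(y))

a = complex(1.0, 2.0)   # arbitrary sample point
l = f(a)                # assumed limit of f at a (f is continuous)

def errors(z: complex):
    """Return (|f(z) - l|, |u(z) - alpha|, |v(z) - beta|)."""
    w = f(z)
    return abs(w - l), abs(w.real - l.real), abs(w.imag - l.imag)

for k in range(1, 6):
    z = a + complex(1, 1) * 10.0 ** (-k)
    total, du, dv = errors(z)
    print(f"|z-a|={abs(z - a):.1e}  |f-l|={total:.3e}  "
          f"|u-alpha|={du:.3e}  |v-beta|={dv:.3e}")
```

In each row the component errors are at most |f(z) − l|, which in turn is at most their sum; these are exactly the two comparisons on which the proof rests.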


In order to apply Theorem 1 fruitfully, notice that since we already know
the limit $\lim_{z \to a} z = a$, we can conclude that

$$\lim_{z \to a} \operatorname{Re}(z) = \operatorname{Re}(a),$$
$$\lim_{z \to a} \operatorname{Im}(z) = \operatorname{Im}(a).$$

A limit like

$$\lim_{z \to a} \sin(\operatorname{Re}(z)) = \sin(\operatorname{Re}(a))$$

follows easily, using continuity of sin. Many applications of these principles
prove such limits as the following:

$$\lim_{z \to a} \bar{z} = \bar{a},$$
$$\lim_{z \to a} |z| = |a|,$$
$$\lim_{(x + iy) \to a + bi} e^y \sin x + i x^3 \cos y = e^b \sin a + i a^3 \cos b.$$
Now that the notion of limit has been extended to complex functions, the
notion of continuity can also be extended: f is continuous at a if
$\lim_{z \to a} f(z) = f(a)$, and f is continuous if f is continuous at a for all a in
the domain of f. The previous work on limits shows that all the following
functions are continuous:

$$f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0,$$
$$f(z) = \bar{z},$$
$$f(z) = |z|,$$
$$f(x + iy) = e^y \sin x + i x^3 \cos y.$$


Examples of discontinuous functions are easy to produce, and certain ones
come up very naturally. One particularly frustrating example is the “argument
function” θ, which is discontinuous at all nonnegative real numbers
(see the “graph” in Figure 2). By suitably redefining θ it is possible to change
the discontinuities; for example (Figure 4), if θ′(z) denotes the unique argument
of z with π/2 ≤ θ′(z) < 5π/2, then θ′ is discontinuous at ai for every
nonnegative real number a. But, no matter how θ is redefined, some
discontinuities will always occur.
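The jump of θ across the positive real axis, and the way the redefined function θ′ moves the jump to the positive imaginary axis, can be seen in a short numerical sketch. It assumes Python's `cmath.phase` for the underlying angle; the names `theta` and `theta_prime` and the offset `eps` are choices made here for illustration.

```python
import cmath
import math

def theta(z: complex) -> float:
    """Argument of nonzero z in [0, 2*pi)."""
    angle = cmath.phase(z)
    return angle if angle >= 0 else angle + 2 * math.pi

def theta_prime(z: complex) -> float:
    """Argument of nonzero z in [pi/2, 5*pi/2)."""
    t = theta(z)
    return t if t >= math.pi / 2 else t + 2 * math.pi

eps = 1e-9
# Just above and just below the point 1 on the real axis.
jump_at_1 = theta(complex(1, -eps)) - theta(complex(1, eps))
# Just left and just right of the point i on the imaginary axis.
jump_at_i = theta_prime(complex(eps, 1)) - theta_prime(complex(-eps, 1))
print(jump_at_1, jump_at_i)
```

Both differences are close to 2π even though the two sample points are only 2·10⁻⁹ apart, which is the discontinuity described above.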


FIGURE 4 (values of f(z) = θ′(z))


The discontinuity of θ has important bearing on the problem of defining a
“square-root function,” that is, a function f such that (f(z))² = z for all z. For
real numbers the function √ had as domain only the nonnegative real numbers.
If complex numbers are allowed, then every number has two square roots
(except 0, which has only one). Although this situation may seem better, it is
in some ways worse; since the square roots of z are complex numbers, there is
no clear criterion for selecting one root to be f(z), in preference to the other.




FIGURE 5. 


One way to define f is the following. We set f(0) = 0, and for z ≠ 0 we set

$$f(z) = \sqrt{|z|}\left(\cos\frac{\theta(z)}{2} + i\sin\frac{\theta(z)}{2}\right).$$

Clearly (f(z))² = z, but the function f is discontinuous, since θ is discontinuous.
As a matter of fact, it is impossible to find a continuous f such that
(f(z))² = z for all z. Although you may be willing to accept this assertion
without any fuss, the argument, given below, is rather tricky.
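The square-root function just defined can be tried out directly. The sketch below is an illustration only: `sqrt_via_theta` is a hypothetical name for the function f above, and the angle comes from Python's `cmath.phase`, shifted into [0, 2π).

```python
import cmath
import math

def theta(z: complex) -> float:
    """Argument of nonzero z in [0, 2*pi)."""
    angle = cmath.phase(z)
    return angle if angle >= 0 else angle + 2 * math.pi

def sqrt_via_theta(z: complex) -> complex:
    """f(0) = 0, and f(z) = sqrt(|z|)(cos(theta/2) + i sin(theta/2)) otherwise."""
    if z == 0:
        return 0j
    half = theta(z) / 2
    return math.sqrt(abs(z)) * complex(math.cos(half), math.sin(half))

samples = [complex(3, 4), complex(-2, 0.5), complex(0, -7), complex(-1, -1)]
for z in samples:
    print(z, sqrt_via_theta(z), sqrt_via_theta(z) ** 2)

eps = 1e-9
above = sqrt_via_theta(complex(4, eps))    # close to  2
below = sqrt_via_theta(complex(4, -eps))   # close to -2
print(above, below)
```

Squaring recovers z in every sample, while the values just above and just below the point 4 on the real axis differ by nearly 4, so this f inherits the discontinuity of θ.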

We will actually show that a continuous square root function f cannot even
be defined on the set of all z with |z| = 1. Suppose there were such a function f.
We must have f(1) = 1 or f(1) = −1, and we can therefore assume
that f(1) = −1, since −f would also be a continuous square root function.
In this case, since f does not have the value 1 at 1, it never takes on the value 1
(if f(z) = 1, then z = (f(z))² = 1). This means that if we define g by

$$g(z) = \theta(f(z)) \quad \text{for } |z| = 1,$$

then g will be continuous, since θ is continuous at all w with |w| = 1 except for
w = 1, which is not f(z) for any z. But the equation

$$(f(z))^2 = z$$

implies that

$$\theta(z) = 2\theta(f(z)) = 2g(z),$$

which would imply that θ is continuous at z for all z with |z| = 1. Since this is
false, no such function f can exist. Similar arguments show that it is impossible
to define continuous “nth-root functions” for any n ≥ 2.

For continuous complex functions there are important analogues of certain
theorems which describe the behavior of real-valued functions on closed
intervals. A natural analogue of the interval [a, b] is the set of all complex numbers
z = x + iy with a ≤ x ≤ b and c ≤ y ≤ d (Figure 5). This set is called a
closed rectangle, and is denoted by [a, b] × [c, d].

If f is a continuous complex-valued function whose domain is [a, b] × [c, d],
then it seems reasonable, and is indeed true, that f is bounded on [a, b] × [c, d].
That is, there is some real number M such that

$$|f(z)| \le M \quad \text{for all } z \text{ in } [a, b] \times [c, d].$$

It does not make sense to say that f has a maximum and a minimum value on
[a, b] × [c, d], since there is no notion of order for complex numbers. If f is a
real-valued function, however, then this assertion does make sense, and is true.
In particular, if f is any complex-valued continuous function on [a, b] × [c, d],
then |f| is also continuous, so there is some z₀ in [a, b] × [c, d] such that

$$|f(z_0)| \le |f(z)| \quad \text{for all } z \text{ in } [a, b] \times [c, d];$$

a similar statement is true with the inequality reversed. It is sometimes said
that “f attains its maximum and minimum modulus on [a, b] × [c, d].”
The various facts listed in the previous paragraph will not be proved here,
although proofs are outlined in Problem 5. Assuming these facts, however, we
can now give a proof of the Fundamental Theorem of Algebra, which is
really quite surprising, since we have not yet said much to distinguish
polynomial functions from other continuous functions.

THEOREM 2 (THE FUNDAMENTAL THEOREM OF ALGEBRA)

Let $a_0, \ldots, a_{n-1}$ be any complex numbers. Then there is a complex number
z such that

$$z^n + a_{n-1}z^{n-1} + \cdots + a_0 = 0.$$

PROOF

Let

$$f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_0.$$

Then f is continuous, and so is the function |f| defined by

$$|f|(z) = |f(z)| = |z^n + a_{n-1}z^{n-1} + \cdots + a_0|.$$


Our proof is based on the observation that a point z₀ with f(z₀) = 0 would
clearly be a minimum point for |f|. To prove the theorem we will first show
that |f| does indeed have a smallest value on the whole complex plane. The proof
will be almost identical to the proof, in Chapter 7, that a polynomial function
of even degree (with real coefficients) has a smallest value on all of R; both
proofs depend on the fact that if |z| is large, then |f(z)| is large.

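We begin by writing, for z ≠ 0,

$$f(z) = z^n\left(1 + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right),$$

so that

$$|f(z)| = |z|^n \cdot \left|1 + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right|.$$

Let

$$M = \max(1, 2n|a_{n-1}|, \ldots, 2n|a_0|).$$

Then for all z with |z| ≥ M, we have $|z^k| \ge |z|$ and

$$\frac{|a_{n-k}|}{|z^k|} \le \frac{|a_{n-k}|}{|z|} \le \frac{1}{2n},$$

so

$$\left|\frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \le \left|\frac{a_{n-1}}{z}\right| + \cdots + \left|\frac{a_0}{z^n}\right| \le \frac{1}{2},$$

which implies that

$$\left|1 + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \ge 1 - \left|\frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \ge \frac{1}{2}.$$

This means that

$$|f(z)| \ge \frac{|z|^n}{2} \quad \text{for } |z| \ge M.$$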



In particular, if |z| ≥ M and also $|z| \ge \sqrt[n]{2|f(0)|}$, then

$$|f(z)| \ge |f(0)|.$$

Now let [a, b] × [c, d] be a closed rectangle (Figure 6) which contains
$\{z : |z| \le \max(M, \sqrt[n]{2|f(0)|})\}$, and suppose that the minimum of |f| on
[a, b] × [c, d] is attained at z₀, so that

(1) $|f(z_0)| \le |f(z)|$ for z in [a, b] × [c, d].

It follows, in particular, that |f(z₀)| ≤ |f(0)|. Thus

(2) if $|z| \ge \max(M, \sqrt[n]{2|f(0)|})$, then $|f(z)| \ge |f(0)| \ge |f(z_0)|$.

Combining (1) and (2) we see that |f(z₀)| ≤ |f(z)| for all z, so that |f| attains
its minimum value on the whole complex plane at z₀.
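The growth estimate |f(z)| ≥ |z|ⁿ/2 for |z| ≥ M, which drives this first half of the proof, can be spot-checked numerically. The sketch below is an illustration under stated assumptions: the coefficients are made up for the example, the helper names are hypothetical, and the check samples only finitely many points on a circle, so it illustrates the estimate rather than proving it.

```python
import cmath
import math

def poly(coeffs, z):
    """Evaluate z^n + a_{n-1} z^{n-1} + ... + a_0 for coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    return z**n + sum(a * z**k for k, a in enumerate(coeffs))

def bound_M(coeffs):
    """M = max(1, 2n|a_{n-1}|, ..., 2n|a_0|), as in the proof."""
    n = len(coeffs)
    return max([1.0] + [2 * n * abs(a) for a in coeffs])

def half_power_bound_holds(coeffs, radius, samples=360):
    """Check |f(z)| >= |z|^n / 2 at sample points on the circle |z| = radius."""
    n = len(coeffs)
    for k in range(samples):
        z = cmath.rect(radius, 2 * math.pi * k / samples)
        if abs(poly(coeffs, z)) < abs(z) ** n / 2:
            return False
    return True

coeffs = [2 - 1j, 0.5j, -3.0]   # hypothetical a_0, a_1, a_2, so n = 3
M = bound_M(coeffs)
print(M, half_power_bound_holds(coeffs, M), half_power_bound_holds(coeffs, 5 * M))
```

For these coefficients M = 18, and the estimate holds at every sampled point on the circles of radius M and 5M, as the proof predicts.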


FIGURE 6 


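To complete the proof of the theorem we now show that f(z₀) = 0. It is
convenient to introduce the function g defined by

$$g(z) = f(z + z_0).$$

Then g is a polynomial function of degree n, whose minimum absolute value
occurs at 0. We want to show that g(0) = 0.

Suppose instead that g(0) = α ≠ 0. If m is the smallest positive power of z
which occurs in the expression for g, we can write

$$g(z) = \alpha + \beta z^m + c_{m+1}z^{m+1} + \cdots + c_n z^n,$$

where β ≠ 0. Now, according to Theorem 24-2 there is a complex number γ
such that

$$\gamma^m = -\frac{\alpha}{\beta}.$$

Then, setting $d_k = c_k\gamma^k$, we have

$$\begin{aligned}
|g(\gamma z)| &= |\alpha + \beta\gamma^m z^m + d_{m+1}z^{m+1} + \cdots + d_n z^n| \\
&= |\alpha - \alpha z^m + d_{m+1}z^{m+1} + \cdots + d_n z^n| \\
&= \left|\alpha\left(1 - z^m + \frac{d_{m+1}}{\alpha}z^{m+1} + \cdots + \frac{d_n}{\alpha}z^n\right)\right| \\
&= \left|\alpha\left(1 - z^m + z^m\left[\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right]\right)\right| \\
&= |\alpha| \cdot \left|1 - z^m + z^m\left[\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right]\right|.
\end{aligned}$$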
a 


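This expression, so torturously arrived at, will enable us to reach a quick
contradiction. Notice first that if |z| is chosen small enough, we will have

$$\left|\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right| < 1.$$

If we choose, from among all z for which this inequality holds, some z which
is real and positive, then

$$\left|z^m\left[\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right]\right| < |z^m| = z^m.$$

Consequently, if 0 < z < 1 we have

$$\begin{aligned}
\left|1 - z^m + z^m\left[\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right]\right|
&\le |1 - z^m| + \left|z^m\left[\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right]\right| \\
&= 1 - z^m + \left|z^m\left[\frac{d_{m+1}}{\alpha}z + \cdots + \frac{d_n}{\alpha}z^{n-m}\right]\right| \\
&< 1 - z^m + z^m = 1.
\end{aligned}$$

This is the desired contradiction: for such a number z we have

$$|g(\gamma z)| < |\alpha|,$$

contradicting the fact that |α| is the minimum of |g| on the whole plane.
Hence, the original assumption must be incorrect, and g(0) = 0. This implies,
finally, that f(z₀) = 0. ∎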
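The second half of the proof is constructive enough to try numerically: given a polynomial g with g(0) = α ≠ 0, choose γ with γᵐ = −α/β and move a small positive distance t along γ to make |g| smaller than |α|. The sketch below is a hedged illustration, not the text's argument: the sample polynomial is made up, the function names are hypothetical, γ is taken to be the principal m-th root supplied by Python's complex power, and t is found by repeated halving rather than from the estimate in the proof.

```python
def g_eval(coeffs, z):
    """Evaluate g(z) = c_0 + c_1 z + ... + c_n z^n for coeffs = [c_0, ..., c_n]."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def smaller_value_point(coeffs):
    """Return a point w with |g(w)| < |g(0)|, assuming g(0) != 0 and g nonconstant."""
    alpha = coeffs[0]
    if alpha == 0:
        raise ValueError("g(0) = 0 already")
    # m is the smallest positive power with a nonzero coefficient beta.
    m = next(k for k in range(1, len(coeffs)) if coeffs[k] != 0)
    beta = coeffs[m]
    gamma = complex(-alpha / beta) ** (1.0 / m)   # one solution of gamma^m = -alpha/beta
    t = 1.0
    while abs(g_eval(coeffs, gamma * t)) >= abs(alpha):
        t /= 2                                     # shrink t until |g| drops below |alpha|
    return gamma * t

coeffs = [1.0, 0.0, 1.0, 1.0]    # hypothetical g(z) = 1 + z^2 + z^3, so alpha = 1, m = 2
w = smaller_value_point(coeffs)
print(w, abs(g_eval(coeffs, w)), abs(coeffs[0]))
```

The loop terminates because, by the computation above, |g(γt)| < |α| for all sufficiently small positive t.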


Even taking into account our omission of the proofs for the basic facts about 
continuous complex functions, this proof verified a deep fact with surprisingly 
little work. It is only natural to hope that other interesting developments will 
arise if we pursue further the analogues of properties of real functions. The 
next obvious step is to define derivatives: a function f is differentiable at a if

$$\lim_{z \to 0} \frac{f(a + z) - f(a)}{z} \quad \text{exists},$$

in which case the limit is denoted by f′(a). It is easy to prove that

$$f'(a) = 0 \quad \text{if } f(z) = c,$$
$$f'(a) = 1 \quad \text{if } f(z) = z,$$
$$(f + g)'(a) = f'(a) + g'(a),$$
$$(f \cdot g)'(a) = f'(a)g(a) + f(a)g'(a),$$
$$\left(\frac{1}{g}\right)'(a) = \frac{-g'(a)}{[g(a)]^2} \quad \text{if } g(a) \neq 0,$$
$$(f \circ g)'(a) = f'(g(a)) \cdot g'(a);$$
the proofs of all these formulas are exactly the same as before. It follows, in
particular, that if $f(z) = z^n$, then $f'(z) = nz^{n-1}$. These formulas only prove
the differentiability of rational functions however. Many other obvious
candidates are not differentiable. Suppose, for example, that

$$f(x + iy) = x - iy \quad (\text{i.e., } f(z) = \bar{z}).$$

If f is to be differentiable at 0, then the limit

$$\lim_{(x + iy) \to 0} \frac{f(x + iy) - f(0)}{x + iy} = \lim_{(x + iy) \to 0} \frac{x - iy}{x + iy}$$

must exist. Notice however, that

$$\text{if } y = 0, \text{ then } \frac{x - iy}{x + iy} = 1,$$

and

$$\text{if } x = 0, \text{ then } \frac{x - iy}{x + iy} = -1;$$

therefore this limit cannot possibly exist, since the quotient has both the values
1 and −1 for x + iy arbitrarily close to 0.
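The two values of the difference quotient for the conjugate function are easy to observe directly. The sketch below is a small illustration; the name `quotient` and the step sizes are choices made here.

```python
def quotient(z: complex) -> complex:
    """Difference quotient (f(z) - f(0)) / z at 0 for f(z) = conjugate of z."""
    return z.conjugate() / z

for k in range(1, 6):
    h = 10.0 ** (-k)
    along_real = quotient(complex(h, 0))   # approach 0 along the real axis
    along_imag = quotient(complex(0, h))   # approach 0 along the imaginary axis
    print(h, along_real, along_imag)
```

However small h is taken, the quotient equals 1 along the real axis and −1 along the imaginary axis, so no single limit can exist.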

In view of this example, it is not at all clear where other differentiable func- 
tions are to come from. If you recall the definitions of sin and exp, you will see 
that there is no hope at all of generalizing these definitions to complex num- 
bers. At the moment the outlook is bleak, but all our problems will soon be 
solved. 


PROBLEMS 


1. (a) For any real number y, define α(x) = x + iy (so that α is a complex-
valued function defined on R). Show that α is continuous. (This
follows immediately from a theorem in this chapter.) Show similarly
that β(y) = x + iy is continuous.

(b) Let f be a continuous function defined on C. For fixed y, let g(x) =
f(x + iy). Show that g is a continuous function (defined on R). Show
similarly that h(y) = f(x + iy) is continuous. Hint: Use part (a).

2. (a) Suppose that f is a continuous real-valued function defined on a
closed rectangle [a, b] × [c, d]. Prove that if f takes on the values f(z)
and f(w) for z and w in [a, b] × [c, d], then f also takes all values
between f(z) and f(w). Hint: Consider g(t) = f(tz + (1 − t)w) for t
in [0, 1].

*(b) If f is a continuous complex-valued function defined on [a, b] ×
[c, d], the assertion in part (a) no longer makes any sense, since we
cannot talk of complex numbers between f(z) and f(w). We might
conjecture that f takes on all values on the line segment between f(z)
and f(w), but even this is false. Find an example which shows this.

3. (a) Prove that if $a_0, \ldots, a_{n-1}$ are any complex numbers, then there are
complex numbers $z_1, \ldots, z_n$ (not necessarily distinct) such that

$$z^n + a_{n-1}z^{n-1} + \cdots + a_0 = \prod_{i=1}^{n} (z - z_i).$$

(b) Prove that if $a_0, \ldots, a_{n-1}$ are real, then $z^n + a_{n-1}z^{n-1} + \cdots + a_0$
can be written as a product of linear factors z + a and quadratic
factors z² + az + b all of whose coefficients are real. (Use Problem
24-7.)

4. In this problem we will consider only polynomials with real coefficients.
Such a polynomial is called a sum of squares if it can be written as
$h_1^2 + \cdots + h_n^2$ for polynomials $h_i$ with real coefficients.

(a) Prove that if f is a sum of squares, then f(x) ≥ 0 for all x.

(b) Prove that if f and g are sums of squares, then so is f · g.

(c) Suppose that f(x) ≥ 0 for all x. Show that f is a sum of squares. Hint:
First write $f(x) = x^k g(x)$, where g(x) ≠ 0 for all x. Then k must be
even (why?), and g(x) > 0 for all x. Now use Problem 3(b).


5. (a) Let A be a set of complex numbers. A number z is called, as in the real
case, a limit point of the set A if for every (real) ε > 0, there is a
point a in A with |z − a| < ε but z ≠ a. Prove the two-dimensional
version of the Bolzano-Weierstrass Theorem: If A is an infinite subset
of [a, b] × [c, d], then A has a limit point in [a, b] × [c, d]. Hint:
First divide [a, b] × [c, d] in half by a vertical line as in Figure 7(a).
Since A is infinite, at least one half contains infinitely many points of
A. Divide this in half by a horizontal line, as in Figure 7(b). Continue
in this way, alternately dividing by vertical and horizontal lines.
(The two-dimensional bisection argument outlined in this hint is so
standard that the title “Bolzano-Weierstrass” often serves to describe
the method of proof, in addition to the theorem itself. See, for example,
H. Petard, “A Contribution to the Mathematical Theory of Big Game
Hunting,” Amer. Math. Monthly, 45 (1938), 446-447.)

FIGURE 7

(b) Prove that a continuous (complex-valued) function on [a, b] × [c, d]
is bounded on [a, b] × [c, d]. (Imitate Problem 21-22.)

(c) Prove that if f is a real-valued continuous function on [a, b] × [c, d],
then f takes on a maximum and minimum value on [a, b] × [c, d].
(You can use the same trick that works for Theorem 7-3.)


(a) a convex subset of the plane

(b) a nonconvex subset of the plane

FIGURE 8

*6. The proof of Theorem 2 cannot be considered to be completely
elementary because the possibility of choosing γ with $\gamma^m = -\alpha/\beta$ depends
on Theorem 24-2, and thus on the trigonometric functions. It is therefore
of some interest to provide an elementary proof that there is a solution for
the equation $z^n - c = 0$.

(a) Make an explicit computation to show that solutions of $z^2 - c = 0$
can be found for any complex number c.

(b) Explain why the solution of $z^n - c = 0$ can be reduced to the case
where n is odd.

(c) Let z₀ be the point where the function $f(z) = z^n - c$ has its minimum
absolute value. If z₀ ≠ 0, show that the integer m in the proof of
Theorem 2 is equal to 1; since we can certainly find γ with $\gamma^1 = -\alpha/\beta$,
the remainder of the proof works for f. It therefore suffices to
show that the minimum absolute value of f does not occur at 0.

(d) Suppose instead that f has its minimum absolute value at 0. Since n is
odd, the points ±δ, ±δi go under f into $-c \pm \delta^n$, $-c \pm \delta^n i$. Show
that for small δ at least one of these points has smaller absolute value
than −c, thereby obtaining a contradiction.


7. Let $f(z) = (z - z_1)^{m_1} \cdots (z - z_k)^{m_k}$.

(a) Show that $f'(z) = (z - z_1)^{m_1} \cdots (z - z_k)^{m_k} \cdot \sum_{\alpha=1}^{k} m_\alpha (z - z_\alpha)^{-1}$.

(b) Let $g(z) = \sum_{\alpha=1}^{k} m_\alpha (z - z_\alpha)^{-1}$. Show that if g(z) = 0, then $z_1, \ldots, z_k$
cannot all lie on the same side of a straight line through z. Hint: Use
Problem 24-11.

(c) A subset K of the plane is convex if K contains the line segment
joining any two points in it (Figure 8). For any set A, there is a smallest
convex set containing it, which is called the convex hull of A (Figure
9); if a point P is not in the convex hull of A, then all of A is contained
on one side of some straight line through P. Using this information,
prove that the roots of f′(z) = 0 lie within the convex hull of the set
$\{z_1, \ldots, z_k\}$. (The definition of convexity in the Appendix to
Chapter 11 is related to the one given here: a function f is convex
if the set of points lying above or on the graph of f is a convex set, and
if the graph contains no straight line segments. Further details about
convex sets will be found in reference [19] of the Suggested Reading.)

FIGURE 9


8. Prove that if f is differentiable at z, then f is continuous at z.

*9. Suppose that f = u + iv where u and v are real-valued functions.

(a) For fixed y₀ let $g(x) = u(x + iy_0)$ and $h(x) = v(x + iy_0)$. Show that if
$f'(x_0 + iy_0) = \alpha + i\beta$ for real α and β, then $g'(x_0) = \alpha$ and $h'(x_0) = \beta$.

(b) On the other hand, suppose that $k(y) = u(x_0 + iy)$ and $l(y) =
v(x_0 + iy)$. Show that $l'(y_0) = \alpha$ and $k'(y_0) = -\beta$.

(c) Suppose that f′(z) = 0 for all z. Show that f is a constant function.

10. (a) Using the expression

$$f(x) = \frac{1}{1 + x^2} = \frac{1}{2i}\left(\frac{1}{x - i} - \frac{1}{x + i}\right),$$

find $f^{(k)}(x)$ for all k.

(b) Use this result to find $\arctan^{(k)}(0)$ for all k.


CHAPTER 26 COMPLEX POWER SERIES 


FIGURE 2 




If you have not already guessed where differentiable complex functions are 
going to come from, the title of this chapter should give the secret away: we 
intend to define functions by means of infinite series. This will necessitate a 
discussion of infinite sequences of complex numbers, and sums of such 
sequences, but (as was the case with limits and continuity) the basic definitions 
are almost exactly the same as for real sequences and series. 

An infinite sequence of complex numbers is, formally, a complex-valued
function whose domain is N; the convenient subscript notation for sequences
of real numbers will also be used for sequences of complex numbers. A
sequence $\{a_n\}$ of complex numbers is most conveniently pictured by labeling
the points $a_n$ in the plane (Figure 1).

The sequence shown in Figure 1 converges to 0, “convergence” of complex
sequences being defined precisely as for real sequences: the sequence $\{a_n\}$
converges to l, in symbols

$$\lim_{n \to \infty} a_n = l,$$

if for every ε > 0 there is a natural number N such that, for all n,

if n > N, then $|a_n - l| < \varepsilon$.

This condition means that any circle drawn around l will contain $a_n$ for all
sufficiently large n (Figure 2); expressed more colloquially, the sequence is
eventually inside any circle drawn around l.

Convergence of complex sequences is not only defined precisely as for real
sequences, but can even be reduced to this familiar case.


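THEOREM 1

Let

$$a_n = b_n + ic_n \quad \text{for real } b_n \text{ and } c_n,$$

and let

$$l = \beta + i\gamma \quad \text{for real } \beta \text{ and } \gamma.$$

Then $\lim_{n \to \infty} a_n = l$ if and only if

$$\lim_{n \to \infty} b_n = \beta \quad \text{and} \quad \lim_{n \to \infty} c_n = \gamma.$$

PROOF

The proof is left as an easy exercise. If there is any doubt as to how to proceed,
consult the similar Theorem 1 of Chapter 25. ∎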


The sum of a sequence $\{a_n\}$ is defined, once again, as $\lim_{n \to \infty} s_n$, where

$$s_n = a_1 + \cdots + a_n.$$

Sequences for which this limit exists are summable; alternatively, we may
say that the infinite series $\sum_{n=1}^{\infty} a_n$ converges if this limit exists, and diverges
otherwise. It is unnecessary to develop any new tests for convergence of
infinite series, because of the following theorem.


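THEOREM 2

Let

$$a_n = b_n + ic_n \quad \text{for real } b_n \text{ and } c_n.$$

Then $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ both converge, and in
this case

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} b_n + i\sum_{n=1}^{\infty} c_n.$$

PROOF

This is an immediate consequence of Theorem 1 applied to the sequence of
partial sums of $\{a_n\}$. ∎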
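The statement that a complex series may be summed through its real and imaginary parts can be checked on a concrete example. The sketch below is an illustration only; it assumes the sample series $\sum_{n \ge 1} r^n$ with r = i/2, whose sum is r/(1 − r) by the usual geometric series formula (valid because |r| < 1), and it truncates the series after N terms.

```python
r = 0.5j          # sample ratio with |r| < 1
N = 60            # number of terms kept; the tail is smaller than 2**-60

terms = [r**n for n in range(1, N + 1)]
total = sum(terms)                           # partial sum of the complex series
real_sum = sum(t.real for t in terms)        # partial sum of the real parts b_n
imag_sum = sum(t.imag for t in terms)        # partial sum of the imaginary parts c_n
closed_form = r / (1 - r)                    # geometric series sum

print(total, complex(real_sum, imag_sum), closed_form)
```

All three printed values agree to within rounding error, with the common value −0.2 + 0.4i.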


There is also a notion of absolute convergence for complex series: the series
$\sum_{n=1}^{\infty} a_n$ converges absolutely if the series $\sum_{n=1}^{\infty} |a_n|$ converges (this is a series of
real numbers, and consequently one to which our earlier tests may be applied).
The following theorem is not quite so easy as the preceding two.


THEOREM 3

Let

$$a_n = b_n + ic_n \quad \text{for real } b_n \text{ and } c_n.$$

Then $\sum_{n=1}^{\infty} a_n$ converges absolutely if and only if $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ both
converge absolutely.

PROOF

Suppose first that $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ both converge absolutely, i.e., that
$\sum_{n=1}^{\infty} |b_n|$ and $\sum_{n=1}^{\infty} |c_n|$ both converge. It follows that $\sum_{n=1}^{\infty} (|b_n| + |c_n|)$ converges.
Now,

$$|a_n| = |b_n + ic_n| \le |b_n| + |c_n|.$$

It follows from the comparison test that $\sum_{n=1}^{\infty} |a_n|$ converges (the numbers $|a_n|$
and $|b_n| + |c_n|$ are real and nonnegative). Thus $\sum_{n=1}^{\infty} a_n$ converges absolutely.

Now suppose that $\sum_{n=1}^{\infty} |a_n|$ converges. Since

$$|a_n| = \sqrt{b_n^2 + c_n^2},$$

it is clear that

$$|b_n| \le |a_n| \quad \text{and} \quad |c_n| \le |a_n|.$$

Once again, the comparison test shows that $\sum_{n=1}^{\infty} |b_n|$ and $\sum_{n=1}^{\infty} |c_n|$ converge. ∎
n=1 n=l 


Two consequences of Theorem 3 are particularly noteworthy. If $\sum_{n=1}^{\infty} a_n$
converges absolutely, then $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ also converge absolutely;
consequently $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ converge, by Theorem 22-4, so $\sum_{n=1}^{\infty} a_n$ converges by
Theorem 2. In other words, absolute convergence implies convergence.
Similar reasoning shows that any rearrangement of an absolutely convergent
series has the same sum. These facts can also be proved directly, without
using the corresponding theorems for real numbers, by first establishing an
analogue of the Cauchy criterion (see Problem 12).

With these preliminaries safely disposed of, we can now consider complex
power series, that is, functions of the form

$$f(z) = \sum_{n=0}^{\infty} a_n(z - a)^n = a_0 + a_1(z - a) + a_2(z - a)^2 + \cdots.$$

Here the numbers a and $a_n$ are allowed to be complex, and we are naturally
interested in the behavior of f for complex z. As in the real case, we shall
usually consider power series centered at 0,

$$f(z) = \sum_{n=0}^{\infty} a_n z^n;$$

in this case, if f(z₀) converges, then f(z) will also converge for |z| < |z₀|. The
proof of this fact will be similar to the proof of Theorem 23-6, but, for reasons
that will soon become clear, we will not use all the paraphernalia of uniform
convergence and the Weierstrass M-test, even though they have complex
analogues. Our next theorem consequently generalizes only a small part of
Theorem 23-6.


THEOREM 4

Suppose that

$$\sum_{n=0}^{\infty} a_n z_0^n = a_0 + a_1 z_0 + a_2 z_0^2 + \cdots$$

converges for some z₀ ≠ 0. Then if |z| < |z₀|, the two series

$$\sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + \cdots$$

$$\sum_{n=1}^{\infty} n a_n z^{n-1} = a_1 + 2a_2 z + 3a_3 z^2 + \cdots$$

both converge absolutely.

FIGURE 3

PROOF
As in the proof of Theorem 23-6, we will need only the fact that the set of
numbers $a_n z_0^n$ is bounded: there is a number M such that

$$|a_n z_0^n| \le M \quad \text{for all } n.$$

We then have

$$|a_n z^n| = |a_n z_0^n| \cdot \left|\frac{z}{z_0}\right|^n \le M\left|\frac{z}{z_0}\right|^n,$$

and, for z ≠ 0,

$$|n a_n z^{n-1}| = \frac{1}{|z|}\, n |a_n z_0^n| \cdot \left|\frac{z}{z_0}\right|^n \le \frac{M}{|z|}\, n \left|\frac{z}{z_0}\right|^n.$$

Since the series $\sum_{n=0}^{\infty} |z/z_0|^n$ and $\sum_{n=1}^{\infty} n|z/z_0|^n$ converge, this shows that both
$\sum_{n=0}^{\infty} a_n z^n$ and $\sum_{n=1}^{\infty} n a_n z^{n-1}$ converge absolutely (the argument for $\sum_{n=1}^{\infty} n a_n z^{n-1}$
assumed that z ≠ 0, but this series certainly converges for z = 0 also). ∎
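The two comparisons in this proof can be checked on a concrete series. The sketch below is an illustration under stated assumptions: it takes the hypothetical coefficients $a_n = (-1)^n/n$ for n ≥ 1 (with a₀ = 0), the point z₀ = 1, where the series converges and the terms are bounded by M = 1, and a sample point z with |z| < |z₀|. A small relative tolerance absorbs floating-point rounding.

```python
def a(n: int) -> float:
    """Hypothetical coefficients a_n = (-1)^n / n for n >= 1."""
    return (-1.0) ** n / n

z0 = 1.0                 # the series sum of a_n z0^n converges (alternating series)
M = 1.0                  # |a_n z0^n| = 1/n <= 1 for all n >= 1
z = complex(0.6, 0.6)    # sample point with |z| < |z0|
r = abs(z / z0)
tol = 1 + 1e-12          # relative tolerance for rounding

terms_ok = all(abs(a(n) * z**n) <= M * r**n * tol for n in range(1, 200))
derived_ok = all(abs(n * a(n) * z**(n - 1)) <= (M / abs(z)) * n * r**n * tol
                 for n in range(1, 200))
abs_partial_sum = sum(abs(a(n) * z**n) for n in range(1, 200))
geometric_bound = M * r / (1 - r)    # sum of M r^n over n >= 1

print(terms_ok, derived_ok, abs_partial_sum, geometric_bound)
```

Both term-by-term bounds hold, and the partial sums of the absolute values stay below the sum of the comparison geometric series.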


Theorem 4 evidently restricts greatly the possibilities for the set

$$\left\{z : \sum_{n=0}^{\infty} a_n z^n \text{ converges}\right\}.$$

For example, the shaded set A in Figure 3 cannot be the set of all z where
$\sum_{n=0}^{\infty} a_n z^n$ converges, since it contains z, but not the number w satisfying
|w| < |z|.

It seems quite unlikely that the set of points where a power series converges
could be anything except the set of points inside a circle. If we allow “circles
of radius 0” (when the power series converges only at 0) and “circles of
radius ∞” (when the power series converges at all points), then this assertion
is true (with one complication which we will soon mention); the proof requires
only Theorem 4 and a knack for good organization.


THEOREM 5

For any power series

$$\sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \cdots$$

one of the following three possibilities must be true:

(1) $\sum_{n=0}^{\infty} a_n z^n$ converges only for z = 0.

(2) $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely for all z in C.

(3) There is a number R > 0 such that $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely if
|z| < R and diverges if |z| > R. (Notice that we do not mention what
happens when |z| = R.)

PROOF


Let

$$S = \left\{x \text{ in } R : \sum_{n=0}^{\infty} a_n w^n \text{ converges for some } w \text{ with } |w| = x\right\}.$$

Suppose first that S is unbounded. Then for any complex number z, there
is a number x in S such that |z| < x. By definition of S, this means that $\sum_{n=0}^{\infty} a_n w^n$
converges for some w with |w| = x > |z|. It follows from Theorem 4 that
$\sum_{n=0}^{\infty} a_n z^n$ converges absolutely. Thus, in this case possibility (2) is true.

Now suppose that S is bounded, and let R be the least upper bound of S. If
R = 0, then $\sum_{n=0}^{\infty} a_n z^n$ converges only for z = 0, so possibility (1) is true.
Suppose, on the other hand, that R > 0. Then if z is a complex number with
|z| < R, there is a number x in S with |z| < x. Once again, this means that
$\sum_{n=0}^{\infty} a_n w^n$ converges for some w with |z| < |w|, so that $\sum_{n=0}^{\infty} a_n z^n$ converges
absolutely. Moreover, if |z| > R, then $\sum_{n=0}^{\infty} a_n z^n$ does not converge, since |z| is
not in S. ∎


The number R which occurs in case (3) is called the radius of convergence
of $\sum_{n=0}^{\infty} a_n z^n$. In cases (1) and (2) it is customary to say that the radius of
convergence is 0 and ∞, respectively. When 0 < R < ∞, the circle {z: |z| = R}
is called the circle of convergence of $\sum_{n=0}^{\infty} a_n z^n$. If z is outside the circle, then,
of course, $\sum_{n=0}^{\infty} a_n z^n$ does not converge, but actually a much stronger statement
can be made: the terms $a_n z^n$ are not even bounded. To prove this, let w be
any number with |z| > |w| > R; if the terms $a_n z^n$ were bounded, then the
proof of Theorem 4 would show that $\sum_{n=0}^{\infty} a_n w^n$ converges, which is false. Thus
(Figure 4), inside the circle of convergence the series $\sum_{n=0}^{\infty} a_n z^n$ converges in the
best possible way (absolutely) and outside the circle the series diverges in the
worst possible way (the terms $a_n z^n$ are not bounded).

FIGURE 4 (the circle of convergence; outside it, the terms $a_n z^n$ are not bounded)
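The contrast between the inside and the outside of the circle of convergence can be seen on a simple example. The sketch below assumes the hypothetical coefficients $a_n = 2^n$, for which the series is geometric with ratio 2z and so has radius of convergence R = 1/2; the two sample points are arbitrary choices on either side of the circle.

```python
def term(n: int, z: complex) -> complex:
    """n-th term a_n z^n for the hypothetical coefficients a_n = 2^n."""
    return (2.0 ** n) * z ** n

inside = complex(0.3, 0.3)     # |z| is about 0.424 < 1/2
outside = complex(0.4, 0.4)    # |z| is about 0.566 > 1/2

# Inside: partial sums of |a_n z^n| stay below the geometric sum 1/(1 - |2z|).
abs_partial_sum = sum(abs(term(n, inside)) for n in range(200))
geometric_sum = 1 / (1 - 2 * abs(inside))

# Outside: the terms themselves grow without bound.
sizes_outside = [abs(term(n, outside)) for n in (10, 50, 100)]

print(abs_partial_sum, geometric_sum, sizes_outside)
```

Inside the circle the absolute values have bounded partial sums, while outside the terms already exceed 10⁵ by n = 100.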

What happens on the circle of convergence is a much more difficult question. 
We will not consider that question at all, except to mention that there are 
power series which converge everywhere on the circle of convergence, power 
series which converge nowhere on the circle of convergence, and power series 
that do just about anything in between. (See Problem 5.) 

Remember that our goal in this chapter is to produce differentiable func- 
tions. We therefore want to generalize the result proved for real power series 
in Chapter 23, that a function defined by a power series can be differentiated 
term-by-term inside the circle of convergence. At this point we can no longer 
imitate the proof of Chapter 23, even if we were willing to introduce uniform 
convergence, because no analogue of Theorem 23-3 seems available. Instead 
we will use a direct argument (which could also have been used in Chapter 23). 
Before beginning the proof, we notice that at least there is no problem about
the convergence of the series produced by term-by-term differentiation. If the
series $\sum_{n=0}^{\infty} a_n z^n$ has radius of convergence R, then Theorem 4 immediately
implies that the series $\sum_{n=1}^{\infty} n a_n z^{n-1}$ also converges for |z| < R. Moreover, if
|z| > R, so that the terms $a_n z^n$ are unbounded, then the terms $n a_n z^{n-1}$ are
surely unbounded, so $\sum_{n=1}^{\infty} n a_n z^{n-1}$ does not converge. This shows that the
radius of convergence of $\sum_{n=1}^{\infty} n a_n z^{n-1}$ is also exactly R.


THEOREM 6

If the power series

$$f(z) = \sum_{n=0}^{\infty} a_n z^n$$

has radius of convergence R > 0, then f is differentiable at z for all z with
|z| < R, and

$$f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}.$$

PROOF


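We will use another “ε/3 argument.” The fact that the theorem is clearly true
for polynomial functions suggests writing

$$\begin{aligned}
(*)\quad \left|\frac{f(z + h) - f(z)}{h} - \sum_{n=1}^{\infty} n a_n z^{n-1}\right|
&= \left|\sum_{n=0}^{\infty} a_n \frac{(z + h)^n - z^n}{h} - \sum_{n=1}^{\infty} n a_n z^{n-1}\right| \\
&\le \left|\sum_{n=0}^{\infty} a_n \frac{(z + h)^n - z^n}{h} - \sum_{n=0}^{N} a_n \frac{(z + h)^n - z^n}{h}\right| \\
&\quad + \left|\sum_{n=0}^{N} a_n \frac{(z + h)^n - z^n}{h} - \sum_{n=1}^{N} n a_n z^{n-1}\right| \\
&\quad + \left|\sum_{n=1}^{N} n a_n z^{n-1} - \sum_{n=1}^{\infty} n a_n z^{n-1}\right|.
\end{aligned}$$

We will show that for any ε > 0, each absolute value on the right side can be
made < ε/3 by choosing N sufficiently large and h sufficiently small. This will
clearly prove the theorem.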

Only the first term in the right side of (*) will present any difficulties. To 
begin with, choose some 29 with [2] < |z0| < R; henceforth we will consider 
only A with |z + ἢ] < [20]. The expression ((z + 4)” — 2")/A can be written 
in a more convenient way if we remember that 


Sy = yt + xr dy ἘΣ χ ἢ 9 ze ΚΕ 21, 
x — y 


Applying this to 
{τ} πε τι ΞΡ ay 


h (eth - δ΄ 
we obtain 
ery = (z Ἢ Ayn + z(z + hj»? + ee ae Fe + gn, 
Since 
|(z af i + Zz + μ)ν 2 tee - zh < η|2 015 , 
we have 


n τ ἢ ἀν zo". 


{2} ἢ} = 5) | 
h 


But the series > παρ] " [2015 1 converges, so if N is sufficiently large, then 
n=1 

oa 9 

nlan| " |Zo|" > < 3 


n=N-+1 




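This means that

$$\begin{aligned}
\left| \sum_{n=0}^{\infty} a_n \frac{(z+h)^n - z^n}{h} - \sum_{n=0}^{N} a_n \frac{(z+h)^n - z^n}{h} \right|
&= \left| \sum_{n=N+1}^{\infty} a_n \frac{(z+h)^n - z^n}{h} \right|
\le \sum_{n=N+1}^{\infty} \left| a_n \frac{(z+h)^n - z^n}{h} \right| \\
&\le \sum_{n=N+1}^{\infty} n |a_n| \cdot |z_0|^{n-1} < \frac{\varepsilon}{3}.
\end{aligned}$$

In short, if $N$ is sufficiently large, then

$$(1)\quad \left| \sum_{n=0}^{\infty} a_n \frac{(z+h)^n - z^n}{h} - \sum_{n=0}^{N} a_n \frac{(z+h)^n - z^n}{h} \right| < \frac{\varepsilon}{3}$$

for all $h$ with $|z + h| < |z_0|$.

It is easy to deal with the third term on the right side of (*): Since $\sum_{n=1}^{\infty} n a_n z^{n-1}$
converges, it follows that if $N$ is sufficiently large, then

$$(2)\quad \left| \sum_{n=1}^{\infty} n a_n z^{n-1} - \sum_{n=1}^{N} n a_n z^{n-1} \right| < \frac{\varepsilon}{3}.$$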


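Finally, choosing an $N$ such that (1) and (2) are true, we note that

$$\lim_{h \to 0} \sum_{n=0}^{N} a_n \frac{(z+h)^n - z^n}{h} = \sum_{n=1}^{N} n a_n z^{n-1},$$

since the polynomial function $g(z) = \sum_{n=0}^{N} a_n z^n$ is certainly differentiable.
Therefore

$$(3)\quad \left| \sum_{n=0}^{N} a_n \frac{(z+h)^n - z^n}{h} - \sum_{n=1}^{N} n a_n z^{n-1} \right| < \frac{\varepsilon}{3}$$

for sufficiently small $h$.

As we have already indicated, (1), (2), and (3) prove the theorem. ∎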
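
The conclusion of Theorem 6 can be illustrated numerically. The following is only a sketch under stated assumptions: the geometric series (all $a_n = 1$, so $f(z) = 1/(1 - z)$ and $R = 1$), the sample point, the cutoff `N`, the step `h`, and the function names are choices made for this example and do not come from the text.

```python
# Illustrative check of term-by-term differentiation for the geometric series
# f(z) = sum z^n = 1/(1 - z), which has radius of convergence 1.
# The series, the sample point, and the cutoff N are assumptions of this sketch.

def series_value(z, N):
    """Partial sum of sum_{n=0}^{N} z^n."""
    return sum(z ** n for n in range(N + 1))

def differentiated_series_value(z, N):
    """Partial sum of the term-by-term derivative sum_{n=1}^{N} n z^(n-1)."""
    return sum(n * z ** (n - 1) for n in range(1, N + 1))

z = 0.3 + 0.4j   # |z| = 0.5, inside the circle of convergence
N = 200

closed_form_derivative = 1 / (1 - z) ** 2
series_derivative = differentiated_series_value(z, N)

# Difference quotient (f(z + h) - f(z)) / h for a small complex h.
h = 1e-6 * (1 + 1j)
difference_quotient = (1 / (1 - (z + h)) - 1 / (1 - z)) / h

print(abs(series_derivative - closed_form_derivative))  # tiny
print(abs(difference_quotient - series_derivative))     # small, of the order of |h|
```

Both printed numbers are small: the differentiated series agrees with $1/(1 - z)^2$, and the difference quotient approaches the same value as $h$ shrinks.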


Theorem 6 has an obvious corollary: a function represented by a power 
series is infinitely differentiable inside the circle of convergence, and the power 
series is its Taylor series at 0. It follows, in particular, that f is continuous inside 
the circle of convergence, since a function differentiable at z is continuous at z 
(Problem 25-8). 




FIGURE 5 


The continuity of a power series inside its circle of convergence helps explain 
the behavior of certain Taylor series obtained for real functions, and gives the 
promised answers to the questions raised at the end of Chapter 23. We have 
already seen that the Taylor series for the function $f(z) = 1/(1 + z^2)$, namely,

$$1 - z^2 + z^4 - z^6 + \cdots,$$

converges for real $z$ only when $|z| < 1$, and consequently has radius of
convergence 1. It is no accident that the circle of convergence contains the two
points $i$ and $-i$ at which $f$ is undefined. If this power series converged in a
circle of radius greater than 1, then (Figure 5) it would represent a function
which was continuous in that circle, in particular at $i$ and $-i$. But this is
impossible, since it equals $1/(1 + z^2)$ inside the unit circle, and $1/(1 + z^2)$ does
not approach a limit as $z$ approaches $i$ or $-i$ from inside the unit circle.

The use of complex numbers also sheds some light on the strange behavior
of the Taylor series for the function

$$f(x) = \begin{cases} e^{-1/x^2}, & x \neq 0 \\ 0, & x = 0. \end{cases}$$

Although we have not yet defined $e^z$ for complex $z$, it will presumably be true
that if $y$ is real and unequal to 0, then

$$f(iy) = e^{-1/(iy)^2} = e^{1/y^2}.$$

The interesting fact about this expression is that it becomes large as $y$ becomes
small. Thus $f$ will not even be continuous at 0 when defined for complex
numbers, so it is hardly surprising that it is equal to its Taylor series only for
$z = 0$.

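The method by which we will actually define $e^z$ (as well as $\sin z$ and $\cos z$)
for complex $z$ should by now be clear. For real $x$ we know that

$$\begin{aligned}
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots, \\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \\
e^x &= 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots.
\end{aligned}$$

For complex $z$ we therefore define

$$\begin{aligned}
\sin z &= z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots, \\
\cos z &= 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots, \\
\exp(z) = e^z &= 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots.
\end{aligned}$$

Then $\sin'(z) = \cos z$, $\cos'(z) = -\sin z$, and $\exp'(z) = \exp(z)$ by Theorem 6.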




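Moreover, if we replace $z$ by $iz$ in the series for $e^z$, something particularly
interesting happens:

$$\begin{aligned}
e^{iz} &= 1 + iz + \frac{(iz)^2}{2!} + \frac{(iz)^3}{3!} + \frac{(iz)^4}{4!} + \frac{(iz)^5}{5!} + \cdots \\
&= 1 + iz - \frac{z^2}{2!} - \frac{iz^3}{3!} + \frac{z^4}{4!} + \frac{iz^5}{5!} + \cdots \\
&= \left( 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots \right) + i \left( z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots \right)
\end{aligned}$$

(this step can be justified by the fact that the series converges absolutely), so

$$e^{iz} = \cos z + i \sin z.$$

It is clear from the definitions (i.e., the power series) that

$$\sin(-z) = -\sin z, \qquad \cos(-z) = \cos z,$$

so we also have

$$e^{-iz} = \cos z - i \sin z.$$

From the equations for $e^{iz}$ and $e^{-iz}$ we can derive the formulas

$$\sin z = \frac{e^{iz} - e^{-iz}}{2i}, \qquad \cos z = \frac{e^{iz} + e^{-iz}}{2}.$$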
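
The relation between the three power series can be checked numerically. The sketch below sums the defining series directly; the function names, the number of terms, and the sample point are assumptions made for illustration only.

```python
import math

# Partial sums of the power series that define exp, sin, and cos for complex z.
# The cutoff of 40 terms is an arbitrary choice for this sketch.

def exp_series(z, terms=40):
    total, term = 0j, 1 + 0j          # term holds z^n / n!
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

def sin_series(z, terms=40):
    total, term = 0j, complex(z)      # term holds (-1)^k z^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term
        term *= -z * z / ((2 * k + 2) * (2 * k + 3))
    return total

def cos_series(z, terms=40):
    total, term = 0j, 1 + 0j          # term holds (-1)^k z^(2k) / (2k)!
    for k in range(terms):
        total += term
        term *= -z * z / ((2 * k + 1) * (2 * k + 2))
    return total

z = 0.7 + 0.3j
lhs = exp_series(1j * z)
rhs = cos_series(z) + 1j * sin_series(z)
print(abs(lhs - rhs))                 # close to 0
print(exp_series(1j * math.pi))       # close to -1
```

The second printed value is the special case $z = \pi$ of the formula for $e^{iz}$.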


The development of complex power series thus places the exponential function 
at the very core of the development of the elementary functions—it reveals a 
connection between the trigonometric and exponential functions which was 
never imagined when these functions were first defined, and which could never 
have been discovered without the use of complex numbers. As a by-product of 
this relationship, we obtain a hitherto unsuspected connection between the 
numbers $e$ and $\pi$: if in the formula

$$e^{iz} = \cos z + i \sin z$$

we take $z = \pi$, we obtain the remarkable result

$$e^{i\pi} = -1.$$


With these remarks we will bring to a close our investigation of complex 
functions. And yet there are still several basic facts about power series which 
have not been mentioned. Thus far, we have seldom considered power series
centered at $a$,

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n,$$

except for $a = 0$. This omission was adopted partly to simplify the exposition.


For power series centered at $a$ there are obvious versions of all the theorems in
this chapter (the proofs require only trivial modifications): there is a number $R$
(possibly 0 or “∞”) such that the series $\sum_{n=0}^{\infty} a_n (z - a)^n$ converges absolutely
for $z$ with $|z - a| < R$, and has unbounded terms for $z$ with $|z - a| > R$;
moreover, for all $z$ with $|z - a| < R$ the function

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n$$

has derivative

$$f'(z) = \sum_{n=1}^{\infty} n a_n (z - a)^{n-1}.$$

It is less straightforward to investigate the possibility of representing a
function as a power series centered at $b$, if it is already written as a power series
centered at $a$. If

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n$$

has radius of convergence $R$, and $b$ is a point with $|b - a| < R$ (Figure 6), then
it is true that $f(z)$ can also be written as a power series centered at $b$,

$$f(z) = \sum_{n=0}^{\infty} b_n (z - b)^n$$

(the numbers $b_n$ are necessarily $f^{(n)}(b)/n!$); moreover, this series has radius of
convergence at least $R - |b - a|$ (it may be larger).

FIGURE 6

We will not prove the facts mentioned in the previous paragraph, and there
are several other important facts we shall not prove. If

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n \quad \text{and} \quad g(z) = \sum_{n=0}^{\infty} b_n (z - a)^n$$

converge in some circle around $a$, it is certainly to be hoped that $f + g$ and
$f \cdot g$ will also converge in a circle around $a$; moreover, if $g(a) \neq 0$, the function
$1/g$ should have a power series representation. Finally, if

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n \quad \text{and} \quad g(z) = \sum_{n=0}^{\infty} b_n (z - b)^n,$$

and $g(b) = a$, then it should be possible to write $f \circ g$ as a power series centered
at $b$.

All of these facts could be proved now without introducing any basic new
ideas. The treatment of $f + g$ is easy, while $f \cdot g$ presents a few problems, and
$1/g$ presents even more problems. The possibility of changing a power series
centered at $a$ into one centered at $b$ becomes yet more involved, and the
treatment of $f \circ g$ requires real skill. Rather than end this section with a tour
de force of computations, we will instead give a preview of “complex analysis,”
one of the most beautiful branches of mathematics, where all these facts are
derived as straightforward consequences of some fundamental results.

FIGURE 7

FIGURE 8

Power series were introduced in this chapter in order to provide complex 
functions which are differentiable. Since these functions are actually infinitely 
differentiable, it is natural to suppose that we have therefore selected only a 
very special collection of differentiable complex functions. The basic theorems 
of complex analysis show that this is not at all true: 


If a complex function is defined in some region A of the plane and is differentiable in
A, then it is automatically infinitely differentiable in A. Moreover, for each point a
in A the Taylor series for f at a will converge to f in any circle contained in A
(Figure 7).


These facts are among the first to be proved in complex analysis. It is 
impossible to give any idea of the proofs themselves—the methods used are 
quite different from anything in elementary calculus. If these facts are 
granted, however, then the facts mentioned before can be proved very easily. 

Suppose, for example, that $f$ and $g$ are functions which can be written as
power series. Then, as we have shown, $f$ and $g$ are differentiable; it then
follows from easy general theorems that $f + g$, $f \cdot g$, $1/g$ and $f \circ g$ are also
differentiable. Appealing to the results from complex analysis, it follows that
they can be written as power series.

Similarly, if

$$f(z) = \sum_{n=0}^{\infty} a_n (z - a)^n$$

has radius of convergence $R$, then $f$ is differentiable in the region $A =
\{z : |z - a| < R\}$. Thus, if $b$ is in $A$, it is possible to write $f$ as a power series
centered at $b$, which will converge in the circle of radius $R - |b - a|$. This
series may actually converge in a larger circle, because $\sum_{n=0}^{\infty} a_n (z - a)^n$ may be
the series for a function differentiable in a larger region than $A$. For example,
suppose that $f(z) = 1/(1 + z^2)$. Then $f$ is differentiable, except at $i$ and $-i$,
where it is not defined. Thus $f(z)$ can be written as a power series $\sum_{n=0}^{\infty} a_n z^n$
with radius of convergence 1 (as a matter of fact, we know that $a_{2n} = (-1)^n$
and $a_k = 0$ if $k$ is odd). It is also possible to write

$$f(z) = \sum_{n=0}^{\infty} b_n \left( z - \tfrac{1}{2} \right)^n,$$

where the numbers $b_n$ are necessarily $b_n = f^{(n)}(\tfrac{1}{2})/n!$. We can easily predict
the radius of convergence of this series: it is $\sqrt{1 + (\tfrac{1}{2})^2}$, the distance from $\tfrac{1}{2}$
to $i$ or $-i$ (Figure 8).
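
The predicted radius $\sqrt{1 + (\tfrac{1}{2})^2} \approx 1.118$ can be compared with a numerical estimate. The sketch below is an illustration, not part of the text's argument: it obtains the coefficients $b_n$ from the partial-fraction decomposition $1/(1+z^2) = \frac{1}{2i}\left(\frac{1}{z-i} - \frac{1}{z+i}\right)$, and then applies a root-test estimate over a finite window of indices. The helper names and the window 200 to 400 are assumptions of this example.

```python
import math

CENTER = 0.5

def coefficient(n, center=CENTER):
    """Coefficient b_n of (z - center)^n in the expansion of 1/(1 + z^2).

    Uses 1/(z - i) = sum_n (-1)^n (z - c)^n / (c - i)^(n+1), and likewise for -i.
    """
    w_minus = (center - 1j) ** (-(n + 1))
    w_plus = (center + 1j) ** (-(n + 1))
    return ((-1) ** n) * (w_minus - w_plus) / 2j

def estimated_radius(first=200, last=400):
    """1 / max |b_n|^(1/n) over a window; a finite stand-in for the root test."""
    largest_root = max(abs(coefficient(n)) ** (1.0 / n) for n in range(first, last + 1))
    return 1.0 / largest_root

predicted = math.sqrt(1 + CENTER ** 2)
print(predicted, estimated_radius())

# The series also reproduces f at a point inside the predicted circle.
z = 1.0
partial_sum = sum(coefficient(n) * (z - CENTER) ** n for n in range(200))
print(abs(partial_sum - 1 / (1 + z ** 2)))   # close to 0
```

The window maximum is used rather than a single term because $|b_n|$ oscillates with $n$; the estimate lands close to the distance from $\tfrac{1}{2}$ to $\pm i$.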




As an added incentive to investigate complex analysis further, one more 
result will be mentioned, which lies quite near the surface, and which will be 
found in any treatment of the subject. 


For real $z$ the values of $\sin z$ always lie between $-1$ and 1, but for complex
$z$ this is not at all true. In fact, if $z = iy$, for $y$ real, then

$$\sin iy = \frac{e^{i(iy)} - e^{-i(iy)}}{2i} = \frac{e^{-y} - e^{y}}{2i}.$$

If $y$ is large, then $\sin iy$ is also large in absolute value. This behavior of sin is
typical of functions which are defined and differentiable on the whole complex 
plane (such functions are called entire). A result which comes quite early in 
complex analysis is the following: 


Liouville’s Theorem: The only bounded entire functions are the constant functions. 


As a simple application of Liouville’s Theorem, consider a polynomial
function

$$f(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0,$$

where $n \geq 1$, so that $f$ is not a constant. We already know that $f(z)$ is large for
large $z$, so Liouville’s Theorem tells us nothing interesting about $f$. But
consider the function

$$g(z) = \frac{1}{f(z)}.$$

If $f(z)$ were never 0, then $g$ would be entire; since $f(z)$ becomes large for large
$z$, the function $g$ would also be bounded, contradicting Liouville’s Theorem.
Thus $f(z) = 0$ for some $z$, and we have proved the Fundamental Theorem of
Algebra.


PROBLEMS 


1. Decide whether each of the following series converges, and whether it
converges absolutely.


(1 +a)" 


n! 


(i) 


1 


bo Zt 
ae eres 


1 


(iii) Σ = 
n 


n=l 


8 Il 


8 il 




(iv) Σ Care 


(v) yet ΕΠ 


2. Use the ratio test to show that the radius of convergence of each of the
following power series is 1. (In each case the ratios of successive terms
will approach a limit < 1 if |z| < 1, but for |z| > 1 the ratios will tend
to ∞ or to a limit > 1.)


Ze 
(i) aa 
n=l] 
οο 
n 
Gi) Yo. 
7 


(v) 5 phir aa 


n=1 


3. Use the root test (Problem 22-7) to find the radius of convergence of each
of the following power series.


.- ΖΦ 23 28 Zz z8 zé 
(i) Gat ae nah os, as ὦ 


niz™ 


4. The root test can always be used, in theory at least, to find the radius of
convergence of a power series; in fact, a close analysis of the situation
leads to a formula for the radius of convergence, known as the “Cauchy-
Hadamard formula.” Suppose first that the set of numbers $\sqrt[n]{|a_n|}$ is
bounded.

(a) Use Problem 22-7 to show that if $\limsup_{n \to \infty} \sqrt[n]{|a_n|}\,|z| < 1$, then
$\sum_{n=0}^{\infty} a_n z^n$ converges.

(b) Also show that if $\limsup_{n \to \infty} \sqrt[n]{|a_n|}\,|z| > 1$, then $\sum_{n=0}^{\infty} a_n z^n$ has
unbounded terms.

(c) Parts (a) and (b) show that the radius of convergence of $\sum_{n=0}^{\infty} a_n z^n$ is
$1/\limsup_{n \to \infty} \sqrt[n]{|a_n|}$ (where “1/0” means “∞”). To complete the formula,
define $\limsup_{n \to \infty} \sqrt[n]{|a_n|} = \infty$ if the set of all $\sqrt[n]{|a_n|}$ is unbounded. Prove
that in this case, $\sum_{n=0}^{\infty} a_n z^n$ diverges for $z \neq 0$, so that the radius of
convergence is 0 (which may be considered as “1/∞”).


5. Consider the following three series from Problem 2:

$$\sum_{n=1}^{\infty} \frac{z^n}{n^2}, \qquad \sum_{n=1}^{\infty} \frac{z^n}{n}, \qquad \sum_{n=1}^{\infty} z^n.$$

Prove that the first series converges everywhere on the unit circle; that
the third series converges nowhere on the unit circle; and that the second
series converges for at least one point on the unit circle and diverges for
at least one point on the unit circle.

6. (a) Prove that $e^z \cdot e^w = e^{z+w}$ for all complex $z$ and $w$. (You just have to
imitate the relevant parts of Problem 17-35.)

(b) Show that $\sin(z + w) = \sin z \cos w + \cos z \sin w$ and $\cos(z + w)
= \cos z \cos w - \sin z \sin w$ for all complex $z$ and $w$.

7. (a) Prove that every complex number of absolute value 1 can be written
$e^{iy}$ for some real number $y$.

(b) Prove that $|e^{x+iy}| = e^x$ for real $x$ and $y$.

8. (a) Prove that exp takes on every complex value except 0.

(b) Prove that sin takes on every complex value.

9. (a) Show that a complex-valued function $f = u + iv$ satisfies the
equation

$$(*)\quad f^{(n)} + a_{n-1} f^{(n-1)} + \cdots + a_0 f = 0,$$

if and only if $u$ and $v$ do. Hint: Use Problem 25-9.

(b) Show that if $a = b + ci$ is a complex root of the equation $z^n +
a_{n-1} z^{n-1} + \cdots + a_0 = 0$, then $f(x) = e^{bx} \sin cx$ and $f(x) =
e^{bx} \cos cx$ are both solutions of (*).


10. (a) Show that exp is not one-one on C.

(b) Given $w \neq 0$, show that $e^z = w$ if and only if $z = x + iy$ with
$x = \log |w|$ (here log denotes the real logarithm function), and $y$ an
argument of $w$.

*(c) Show that there does not exist a continuous function log defined for
nonzero complex numbers, such that $\exp(\log(z)) = z$ for all $z \neq 0$.
(Show that log cannot even be defined continuously for $|z| = 1$.)

(d) Since there is no way to define a continuous logarithm function we
cannot speak of the logarithm of a complex number, but only of “a
logarithm for $w$,” meaning one of the infinitely many numbers $z$ with
$e^z = w$. Find all logarithms for 1.

(e) Find all values of $i^i$, that is, of $e^{iz}$ where $z$ is a logarithm of $i$. (The
answers will be real numbers.)

(f) Show that $(1^i)^i$ has infinitely many possible values, while $1^{i \cdot i}$ has
only one.

11. (a) For real $x$ show that we can choose $\log(x + i)$ and $\log(x - i)$ to be

$$\begin{aligned}
\log(x + i) &= \log\left(\sqrt{1 + x^2}\right) + i \left( \frac{\pi}{2} - \arctan x \right), \\
\log(x - i) &= \log\left(\sqrt{1 + x^2}\right) - i \left( \frac{\pi}{2} - \arctan x \right).
\end{aligned}$$

(It will help to note that $\pi/2 - \arctan x = \arctan 1/x$ for $x > 0$.)

(b) The expression

$$\frac{1}{1 + x^2} = \frac{1}{2i} \left( \frac{1}{x - i} - \frac{1}{x + i} \right)$$

yields, formally,

$$\int \frac{dx}{1 + x^2} = \frac{1}{2i} \left[ \log(x - i) - \log(x + i) \right].$$

Use part (a) to check that this answer agrees with the usual one.

12. (a) A sequence $\{a_n\}$ of complex numbers is called a Cauchy sequence
if $\lim_{m,n \to \infty} |a_m - a_n| = 0$. Suppose that $a_n = b_n + i c_n$, where $b_n$ and
$c_n$ are real. Prove that $\{a_n\}$ is a Cauchy sequence if and only if
$\{b_n\}$ and $\{c_n\}$ are Cauchy sequences.

(b) Prove that every Cauchy sequence of complex numbers converges.

(c) Give direct proofs, without using theorems about real series, that an
absolutely convergent series is convergent and that any rearrangement
has the same sum. (It is permitted, and in fact advisable, to
use the proofs of the corresponding theorems for real series.)


13. Prove the formula

$$\frac{1}{2} + \cos \theta + \cos 2\theta + \cdots + \cos n\theta = \frac{\sin\left(n + \frac{1}{2}\right)\theta}{2 \sin \dfrac{\theta}{2}}$$

by writing $\cos n\theta = (e^{in\theta} + e^{-in\theta})/2$, summing the resulting geometric
series, and multiplying top and bottom by $e^{-i\theta/2}$.

14. Let $\{a_n\}$ be the Fibonacci sequence, $a_1 = a_2 = 1$, $a_{n+2} = a_n + a_{n+1}$.

(a) If $r_n = a_{n+1}/a_n$, show that $r_{n+1} = 1 + 1/r_n$.

(b) Show that $r = \lim_{n \to \infty} r_n$ exists, and $r = 1 + 1/r$. Conclude that
$r = (1 + \sqrt{5})/2$.

(c) Show that $\sum_{n=1}^{\infty} a_n z^n$ has radius of convergence $2/(1 + \sqrt{5})$. (Using
the unproved theorems in this chapter, and the fact that $\sum_{n=1}^{\infty} a_n z^{n-1} =
-1/(z^2 + z - 1)$ from Problem 23-8, we could have predicted that
the radius of convergence is the smallest absolute value of the roots
of $z^2 + z - 1 = 0$; since the roots are $(-1 \pm \sqrt{5})/2$, the radius
of convergence should be $(-1 + \sqrt{5})/2$. Notice that this number
is indeed equal to $2/(1 + \sqrt{5})$.)

15. Since $(e^z - 1)/z$ can be written as the power series $1 + z/2! + z^2/3!
+ \cdots$ which is nonzero at 0, it follows that if $g(z) = z/(e^z - 1)$ for
$z \neq 0$, and $g(0) = 1$, then $g$ is differentiable inside some circle around 0.
It then follows from the unproved theorem in this chapter that there is a
power series

$$\frac{z}{e^z - 1} = \sum_{n=0}^{\infty} \frac{b_n}{n!} z^n$$

with nonzero radius of convergence. We can even predict the radius of
convergence; it is $2\pi$, since this is the smallest absolute value of the
numbers $z = 2k\pi i$ for which $e^z - 1 = 0$. The numbers $b_n$ appearing here
are called the Bernoulli numbers.*


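(a) Clearly $b_0 = g(0) = 1$. Now show that

$$\frac{z}{e^z - 1} = -\frac{z}{2} + \frac{z}{2} \cdot \frac{e^z + 1}{e^z - 1}, \qquad \frac{e^{-z} + 1}{e^{-z} - 1} = -\frac{e^z + 1}{e^z - 1},$$

and deduce that

$$b_1 = -\tfrac{1}{2}, \qquad b_n = 0 \text{ if } n \text{ is odd and } n > 1.$$

* Sometimes the numbers $B_n = (-1)^{n-1} b_{2n}$ are called the Bernoulli numbers, because $b_n = 0$
if $n$ is odd and > 1 (see part (a)) and because the numbers $b_{2n}$ alternate in sign, although we
will not prove this. Other modifications of this nomenclature are also in use.

(b) By finding the coefficient of $z^n$ in the right side of the equation

$$z = \left( \sum_{k=0}^{\infty} \frac{b_k}{k!} z^k \right) \left( z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \right),$$

show that

$$\sum_{i=0}^{n-1} \binom{n}{i} b_i = 0 \quad \text{for } n > 1.$$

This formula allows us to compute any $b_k$ in terms of previous ones,
and shows that each is rational. Calculate two or three of the
following:

$$b_2 = \tfrac{1}{6}, \quad b_4 = -\tfrac{1}{30}, \quad b_6 = \tfrac{1}{42}, \quad b_8 = -\tfrac{1}{30}.$$

*(c) Part (a) shows that

$$\sum_{n=0}^{\infty} \frac{b_{2n}}{(2n)!} z^{2n} = \frac{z}{2} \cdot \frac{e^z + 1}{e^z - 1} = \frac{z}{2} \cdot \frac{e^{z/2} + e^{-z/2}}{e^{z/2} - e^{-z/2}}.$$

Replace $z$ by $2iz$ and show that

$$z \cot z = \sum_{n=0}^{\infty} \frac{b_{2n}}{(2n)!} (-1)^n 2^{2n} z^{2n}.$$

*(d) Show that

$$\tan z = \cot z - 2 \cot 2z.$$

*(e) Show that

$$\tan z = \sum_{n=1}^{\infty} \frac{b_{2n}}{(2n)!} (-1)^{n-1} 2^{2n} (2^{2n} - 1) z^{2n-1}.$$

(This series converges for $|z| < \pi/2$.)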


16. The Bernoulli numbers play an important role in a theorem which is
best introduced by some notational nonsense. Let us use $D$ to denote the
“differentiation operator,” so that $Df$ denotes $f'$. Then $D^k f$ will mean
$f^{(k)}$ and $e^D f$ will mean $\sum_{n=0}^{\infty} f^{(n)}/n!$ (of course this series makes no sense in
general, but it will make sense if $f$ is a polynomial function, for example).
Finally, let $\Delta$ denote the “difference operator” for which $\Delta f(x) =
f(x + 1) - f(x)$. Now Taylor’s Theorem implies, disregarding questions
of convergence, that

$$f(x + 1) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x)}{n!},$$

or

$$(*)\quad f(x + 1) - f(x) = \sum_{n=1}^{\infty} \frac{f^{(n)}(x)}{n!};$$

we may write this symbolically as $\Delta f = (e^D - 1)f$, where 1 is the
“identity operator.” Even more symbolically this can be written
$\Delta = e^D - 1$, which suggests that

$$D = \frac{D}{e^D - 1} \Delta.$$

Thus we obviously ought to have

$$D = \sum_{k=0}^{\infty} \frac{b_k}{k!} D^k \Delta,$$

i.e.,

$$(**)\quad f'(x) = \sum_{k=0}^{\infty} \frac{b_k}{k!} \left[ f^{(k)}(x + 1) - f^{(k)}(x) \right].$$

The beautiful thing about all this nonsense is that it works!

(a) Prove that (**) is literally true if $f$ is a polynomial function (in
which case the infinite sum is really a finite sum). Hint: By applying
(*) to $f^{(k)}$, find a formula for $f^{(k)}(x + 1) - f^{(k)}(x)$; then use the
formula in Problem 15(b) to find the coefficient of $f^{(j)}(x)$ in the
right side of (**).

(b) Deduce from (**) that

$$f'(0) + \cdots + f'(n) = \sum_{k=0}^{\infty} \frac{b_k}{k!} \left[ f^{(k)}(n + 1) - f^{(k)}(0) \right].$$

(c) Show that for any polynomial function $g$ we have

$$g(0) + \cdots + g(n) = \int_0^{n+1} g(t)\,dt + \sum_{k=1}^{\infty} \frac{b_k}{k!} \left[ g^{(k-1)}(n + 1) - g^{(k-1)}(0) \right].$$

(d) Apply this to $g(x) = x^p$ to show that

$$\sum_{k=1}^{n-1} k^p = \frac{n^{p+1}}{p + 1} + \sum_{k=1}^{p} \frac{b_k}{k} \binom{p}{k-1} n^{p-k+1}.$$

Using the fact that $b_1 = -\tfrac{1}{2}$, show that

$$\sum_{k=1}^{n} k^p = \frac{n^{p+1}}{p + 1} + \frac{n^p}{2} + \sum_{k=2}^{p} \frac{b_k}{k} \binom{p}{k-1} n^{p-k+1}.$$

The first ten instances of this formula were written out in Problem
2-6, which offered as a challenge the discovery of the general pattern.
This may now seem to be a preposterous suggestion, but the
Bernoulli numbers were actually discovered in precisely this way! After
writing out these 10 formulas, Bernoulli claims (in his posthumously
printed work Ars Conjectandi, 1713): “Whoever will examine the
series as to their regularity may be able to continue the table.” He
then writes down the above formula, offering no proof at all, merely
noting that the coefficients $b_k$ (which he denoted simply by A, B,
C, ...) satisfy the equation in Problem 15(b). The relation
between these numbers and the coefficients in the power series for
$z/(e^z - 1)$ was discovered by Euler.


*17. The formula in Problem 16(c) can be generalized to the case where $g$ is
not a polynomial function; the infinite sum must be replaced by a finite
sum plus a remainder term. In order to find an expression for the
remainder, it is useful to introduce some new functions.

(a) The Bernoulli polynomials $\varphi_n$ are defined by

$$\varphi_n(x) = \sum_{k=0}^{n} \binom{n}{k} b_{n-k} x^k.$$

The first three are

$$\begin{aligned}
\varphi_1(x) &= x - \frac{1}{2}, \\
\varphi_2(x) &= x^2 - x + \frac{1}{6}, \\
\varphi_3(x) &= x^3 - \frac{3x^2}{2} + \frac{x}{2}.
\end{aligned}$$

Show that

$$\begin{aligned}
\varphi_n(0) &= b_n, \\
\varphi_n(1) &= b_n \quad \text{if } n > 1, \\
\varphi_n'(x) &= n \varphi_{n-1}(x), \\
\varphi_n(x) &= (-1)^n \varphi_n(1 - x) \quad \text{for } n > 1.
\end{aligned}$$

Hint: Prove the last equation by induction on $n$, starting with $n = 2$.

(b) Let $R_N^k(x)$ be the remainder term in Taylor’s formula for $f^{(k)}$, on the
interval $[x, x + 1]$, so that

$$(*)\quad f^{(k)}(x + 1) - f^{(k)}(x) = \sum_{n=1}^{N} \frac{f^{(k+n)}(x)}{n!} + R_N^k(x).$$

Prove that

$$f'(x) = \sum_{k=0}^{N} \frac{b_k}{k!} \left[ f^{(k)}(x + 1) - f^{(k)}(x) \right] - \sum_{k=0}^{N} \frac{b_k}{k!} R_{N-k}^k(x).$$

Hint: Imitate Problem 16(a). Notice the subscript $N - k$ on $R$.

(c) Use the integral form of the remainder to show that

$$\sum_{k=0}^{N} \frac{b_k}{k!} R_{N-k}^k(x) = \int_x^{x+1} \frac{\varphi_N(x + 1 - t)}{N!} f^{(N+1)}(t)\,dt.$$

(d) Deduce the “Euler-Maclaurin Summation Formula”:

$$g(x) + g(x + 1) + \cdots + g(x + n) = \int_x^{x+n+1} g(t)\,dt + \sum_{k=1}^{N} \frac{b_k}{k!} \left[ g^{(k-1)}(x + n + 1) - g^{(k-1)}(x) \right] + S_N(x, n),$$

where

$$S_N(x, n) = -\sum_{j=0}^{n} \int_{x+j}^{x+j+1} \frac{\varphi_N(x + j + 1 - t)}{N!} g^{(N)}(t)\,dt.$$

(e) Let $\psi_n$ be the periodic function, with period 1, which satisfies
$\psi_n(t) = \varphi_n(t)$ for $0 \le t < 1$. (Part (a) implies that if $n > 1$, then
$\psi_n$ is continuous, since $\varphi_n(1) = \varphi_n(0)$, and also that $\psi_n$ is even if $n$ is
even and odd if $n$ is odd.) Show that

$$S_N(x, n) = -\int_x^{x+n+1} \frac{\psi_N(x - t)}{N!} g^{(N)}(t)\,dt$$

$$\left( = (-1)^{N+1} \int_x^{x+n+1} \frac{\psi_N(t)}{N!} g^{(N)}(t)\,dt \text{ if } x \text{ is an integer} \right).$$

Unlike the remainder in Taylor’s Theorem, the remainder $S_N(x, n)$ usually
does not satisfy $\lim_{N \to \infty} S_N(x, n) = 0$, because the Bernoulli numbers and
functions become large very rapidly (although the first few examples do not suggest
this). Nevertheless, important information can often be obtained from the
summation formula. The general situation is best discussed within the context
of a specialized study (“asymptotic series”), but the next problem shows one
particularly important example.


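**18. (a) Use the Euler-Maclaurin Formula, with $N = 2$, to show that

$$\log 1 + \cdots + \log(n - 1) = \int_1^n \log t\,dt - \frac{1}{2} \log n + \frac{1}{12} \left( \frac{1}{n} - 1 \right) + \int_1^n \frac{\psi_2(t)}{2t^2}\,dt.$$

(b) Show that

$$\log \left( \frac{n!}{n^{n+1/2} e^{-n+1/(12n)}} \right) = \frac{11}{12} + \int_1^n \frac{\psi_2(t)}{2t^2}\,dt.$$

(c) Explain why the improper integral $\beta = \int_1^{\infty} \psi_2(t)/2t^2\,dt$ exists, and
show that if $\alpha = \exp(\beta + 11/12)$, then

$$\log \left( \frac{n!}{\alpha n^{n+1/2} e^{-n+1/(12n)}} \right) = -\int_n^{\infty} \frac{\psi_2(t)}{2t^2}\,dt.$$

(d) Problem 18-26(d) shows that

$$\sqrt{\pi} = \lim_{n \to \infty} \frac{(n!)^2 2^{2n}}{(2n)! \sqrt{n}}.$$

Use part (c) to show that

$$\sqrt{\pi} = \lim_{n \to \infty} \frac{\alpha^2 n^{2n+1} e^{-2n} 2^{2n}}{\alpha (2n)^{2n+1/2} e^{-2n} \sqrt{n}}$$

and conclude that $\alpha = \sqrt{2\pi}$.

(e) Show that the maximum value of $|\varphi_2(x)|$ for $x$ in $[0, 1]$ is $\tfrac{1}{6}$, so that

$$\left| \int_n^{\infty} \frac{\psi_2(t)}{2t^2}\,dt \right| < \frac{1}{12n}.$$

Conclude that

$$\sqrt{2\pi}\, n^{n+1/2} e^{-n} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n)}.$$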


The final result of Problem 18, a strong form of Stirling’s Formula, shows
that $n!$ is approximately $\sqrt{2\pi}\, n^{n+1/2} e^{-n}$ in the sense that this expression differs
from $n!$ by an amount which is small compared to $n!$ when $n$ is large. For
example, for $n = 10$ we obtain 3598696 instead of 3628800, with an error <1%.

A more general form of Stirling’s Formula illustrates the “asymptotic”
nature of the summation formula. The same argument which was used in
Problem 18 can now be used to show that for $N \geq 2$ we have

$$\log \left( \frac{n!}{\sqrt{2\pi}\, n^{n+1/2} e^{-n}} \right) = \sum_{k=2}^{N} \frac{b_k}{k(k - 1) n^{k-1}} - \int_n^{\infty} \frac{\psi_N(t)}{N t^N}\,dt.$$

Since $\psi_N$ is bounded, we can obtain estimates of the form

$$\left| \int_n^{\infty} \frac{\psi_N(t)}{N t^N}\,dt \right| \le \frac{M_N}{n^{N-1}}.$$

If $N$ is large, the constant $M_N$ will also be large; but for very large $n$ the factor
$n^{1-N}$ will make the product very small. Thus, the expression

$$\sqrt{2\pi}\, n^{n+1/2} e^{-n} \cdot \exp \left( \sum_{k=2}^{N} \frac{b_k}{k(k - 1) n^{k-1}} \right)$$

may be a very bad approximation for $n!$ when $n$ is small, but for large $n$ (how
large depends on $N$) it will be an extremely good one (how good depends on $N$).
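
The numbers quoted above can be reproduced with a short computation. The sketch below is illustrative only: it generates the Bernoulli numbers from the recurrence of Problem 15(b) and evaluates the approximation $\sqrt{2\pi}\, n^{n+1/2} e^{-n} \exp\left(\sum_{k=2}^{N} b_k / (k(k-1)n^{k-1})\right)$. The function names and the convention that `N=1` means "no correction terms" are assumptions of this example.

```python
import math
from fractions import Fraction

def bernoulli_numbers(count):
    """Return b_0, ..., b_{count-1} using sum_{i=0}^{n-1} C(n, i) b_i = 0 for n > 1."""
    b = [Fraction(1)]
    for m in range(1, count):
        # Take n = m + 1 in the recurrence and solve for b_m.
        partial = sum(math.comb(m + 1, i) * b[i] for i in range(m))
        b.append(-partial / (m + 1))
    return b

def stirling(n, N=1):
    """sqrt(2 pi) n^(n+1/2) e^(-n), times exp of the correction terms k = 2..N."""
    b = bernoulli_numbers(N + 1)
    correction = sum(float(b[k]) / (k * (k - 1) * n ** (k - 1))
                     for k in range(2, N + 1))
    return math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n + correction)

print(bernoulli_numbers(9))          # b_2 = 1/6, b_4 = -1/30, b_6 = 1/42, b_8 = -1/30
print(round(stirling(10)))           # basic formula for n = 10
print(math.factorial(10))            # 3628800
for N in (1, 2, 4):
    print(N, stirling(10, N) / math.factorial(10) - 1)   # relative error
```

For $n = 10$ the relative error shrinks quickly as correction terms are added, which matches the remark that the expression is an extremely good approximation once $n$ is large relative to $N$.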


There was a most ingenious Architect 
who had contrived a new Method 
for building Houses, 

by beginning at the Roof, and working 
downwards to the Foundation. 


JONATHAN SWIFT 


part @ 


EPILOGUE 


CHAPTER 


FIELDS 


Throughout this book a conscientious attempt has been made to define all
important concepts, even terms like “function,” for which an intuitive
definition is often considered sufficient. But Q and R, the two main protagonists of
this story, have only been named, never defined. What has never been defined
can never be analyzed thoroughly, and “properties” P1-P13 must be
considered assumptions, not theorems, about numbers. Nevertheless, the term
“axiom” has been purposely avoided, and in this chapter the logical status
of P1-P13 will be scrutinized more carefully.

Like Q and R, the sets N and Z have also remained undefined. True, some
talk about all four was inserted in Chapter 2, but those rough descriptions are
far from a definition. To say, for example, that N consists of 1, 2, 3, etc.,
merely names some elements of N without identifying them (and the “etc.” is
useless). The natural numbers can be defined, but the procedure is involved
and not quite pertinent to the rest of the book. The Suggested Reading list
contains references to this problem, as well as to the other steps that are
required if one wishes to develop calculus from its basic logical starting point.
The further development of this program would proceed with the definition
of Z, in terms of N, and the definition of Q in terms of Z. This program results
in a certain well-defined set Q, certain explicitly defined operations + and ·,
and properties P1-P12 as theorems. The final step in this program is the
construction of R, in terms of Q. It is this last construction which concerns us.
Assuming that Q has been defined, and that P1-P12 have been proved for Q,
we shall ultimately define R and prove all of P1-P13 for R.

Our intention of proving P1-P13 means that we must define not only real
numbers, but also addition and multiplication of real numbers. Indeed, the
real numbers are of interest only as a set together with these operations: how
the real numbers behave with respect to addition and multiplication is crucial;
what the real numbers may actually be is quite irrelevant. This assertion can
be expressed in a meaningful mathematical way, by using the concept of a
“field,” which includes as special cases the three important number systems
of this book. This extraordinarily important abstraction of modern
mathematics incorporates the properties P1-P9 common to Q, R, and C. A field is
a set F (of objects of any sort whatsoever), together with two “binary
operations” + and · defined on F (that is, two rules which associate to elements
a and b in F, other elements a + b and a · b in F) for which the following
conditions are satisfied:

(1) (a + b) + c = a + (b + c) for all a, b, and c in F.
(2) There is some element 0 in F such that
    (i) a + 0 = a for all a in F,
    (ii) for every a in F, there is some element b in F such that a + b = 0.
(3) a + b = b + a for all a and b in F.
(4) (a · b) · c = a · (b · c) for all a, b, and c in F.
(5) There is some element 1 in F such that 1 ≠ 0 and
    (i) a · 1 = a for all a in F,
    (ii) for every a in F with a ≠ 0, there is some element b in F such
         that a · b = 1.
(6) a · b = b · a for all a and b in F.
(7) a · (b + c) = a · b + a · c for all a, b, and c in F.

The familiar examples of fields are, as already indicated, Q, R, and C, with
+ and · being the familiar operations of + and ·. It is probably unnecessary
to explain why these are fields, but the explanation is, at any rate, quite brief.
When + and · are understood to mean the ordinary + and ·, the rules 1, 3,
4, 6, 7 are simply restatements of P1, P4, P5, P8, P9; the elements which play
the role of 0 and 1 are the numbers 0 and 1 (which accounts for the choice of
the symbols 0, 1); and the number b in (2) or (5) is −a or a⁻¹, respectively.
(For this reason, in an arbitrary field F we denote by −a the element such that
a + (−a) = 0, and by a⁻¹ the element such that a · a⁻¹ = 1, for a ≠ 0.)

In addition to Q, R, and C, there are several other fields which can be
described easily. One example is the collection $F_1$ of all numbers $a + b\sqrt{2}$
for $a$, $b$ in Q. The operations + and · will, once again, be the usual + and ·
for real numbers. It is necessary to point out that these operations really do
produce new elements of $F_1$:

$$\begin{aligned}
(a + b\sqrt{2}) + (c + d\sqrt{2}) &= (a + c) + (b + d)\sqrt{2}, \quad \text{which is in } F_1; \\
(a + b\sqrt{2}) \cdot (c + d\sqrt{2}) &= (ac + 2bd) + (bc + ad)\sqrt{2}, \quad \text{which is in } F_1.
\end{aligned}$$

Conditions (1), (3), (4), (6), (7) for a field are obvious for $F_1$: since these hold
for all real numbers, they certainly hold for all real numbers of the form
$a + b\sqrt{2}$. Condition (2) holds because the number $0 = 0 + 0\sqrt{2}$ is in $F_1$
and, for $\alpha = a + b\sqrt{2}$ in $F_1$, the number $\beta = (-a) + (-b)\sqrt{2}$ in $F_1$
satisfies $\alpha + \beta = 0$. Similarly, $1 = 1 + 0\sqrt{2}$ is in $F_1$, so (5i) is satisfied.
The verification of (5ii) is the only slightly difficult point. If $a + b\sqrt{2} \neq 0$,
then

$$(a + b\sqrt{2}) \cdot \frac{1}{a + b\sqrt{2}} = 1;$$

it is therefore necessary to show that $1/(a + b\sqrt{2})$ is in $F_1$. This is true
because

$$\frac{1}{a + b\sqrt{2}} = \frac{a - b\sqrt{2}}{(a - b\sqrt{2})(a + b\sqrt{2})} = \frac{a}{a^2 - 2b^2} + \frac{(-b)}{a^2 - 2b^2} \sqrt{2}.$$

(The division by $a - b\sqrt{2}$ is valid because the relation $a - b\sqrt{2} = 0$ could
be true only if $a = b = 0$ (since $\sqrt{2}$ is irrational) which is ruled out by the
hypothesis $a + b\sqrt{2} \neq 0$.)
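
The arithmetic of this field can be modelled exactly. In the sketch below, which is an illustration and not part of the text, an element $a + b\sqrt{2}$ is represented by the pair `(a, b)` of rational numbers; the representation and the function names are assumptions made for the example.

```python
from fractions import Fraction

# An element a + b*sqrt(2), with a and b rational, is stored as the pair (a, b).

def add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + b sqrt2)(c + d sqrt2) = (ac + 2bd) + (bc + ad) sqrt2
    return (a * c + 2 * b * d, b * c + a * d)

def inverse(x):
    a, b = x
    norm = a * a - 2 * b * b   # nonzero unless a = b = 0, since sqrt(2) is irrational
    if norm == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / norm, -b / norm)

ONE = (Fraction(1), Fraction(0))

x = (Fraction(3), Fraction(-1, 2))       # the number 3 - (1/2) sqrt(2)
print(inverse(x))                        # its reciprocal, again of the form (a, b)
print(mul(x, inverse(x)) == ONE)         # True
```

The reciprocal is again a pair of rationals, which is exactly the content of condition (5ii) for this field.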

The next example of a field, $F_2$, is considerably simpler in one respect: it
contains only two elements, which we might as well denote by 0 and 1. The
operations + and · are described by the following tables.

    + | 0  1        · | 0  1
    --+------       --+------
    0 | 0  1        0 | 0  0
    1 | 1  0        1 | 0  1

Conditions (1)-(7) can be verified by straightforward, case-by-case
checks. For example, condition (1) may be proved by checking the 8 equations
obtained by setting a, b, c = 0 or 1. Notice that in this field 1 + 1 = 0; this
equation may also be written 1 = −1.
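
The case-by-case checks can be carried out mechanically. The following sketch, offered only as an illustration, stores the two tables as dictionaries and tests conditions (1)-(7) over all choices of elements; the names `ADD`, `MUL`, and `field_conditions` are assumptions of this example.

```python
from itertools import product

ELEMENTS = (0, 1)
ADD = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}   # addition table
MUL = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # multiplication table

def add(a, b):
    return ADD[(a, b)]

def mul(a, b):
    return MUL[(a, b)]

def field_conditions():
    """Return a dict mapping each condition (1)-(7) to True or False."""
    triples = list(product(ELEMENTS, repeat=3))       # the 8 choices of a, b, c
    pairs = list(product(ELEMENTS, repeat=2))
    return {
        "(1)": all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples),
        "(2)": all(add(a, 0) == a for a in ELEMENTS)
               and all(any(add(a, b) == 0 for b in ELEMENTS) for a in ELEMENTS),
        "(3)": all(add(a, b) == add(b, a) for a, b in pairs),
        "(4)": all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in triples),
        "(5)": all(mul(a, 1) == a for a in ELEMENTS)
               and all(any(mul(a, b) == 1 for b in ELEMENTS)
                       for a in ELEMENTS if a != 0),
        "(6)": all(mul(a, b) == mul(b, a) for a, b in pairs),
        "(7)": all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples),
    }

print(field_conditions())
print(add(1, 1))   # 0, i.e. 1 = -1 in this field
```

Every entry of the returned dictionary is `True`, which is the brute-force counterpart of the checks described above.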

Our final example of a field is rather silly: $F_3$ consists of all pairs (a, a) for
a in R, and + and · are defined by

    (a, a) + (b, b) = (a + b, a + b),
    (a, a) · (b, b) = (a · b, a · b).

(The + and · appearing on the right side are ordinary addition and
multiplication for R.) The verification that $F_3$ is a field is left to you as a simple
exercise.

A detailed investigation of the properties of fields is a study in itself, but 
for our purposes, fields provide an ideal framework in which to discuss the 
properties of numbers in the most economical way. For example, the conse- 
quences of P1—P9 which were derived for ‘‘numbers” in Chapter 1 actually 
hold for any field; in particular, they are true for the fields Q, R, and C. 

Notice that certain common properties of Q, R, and C do not hold for all
fields. For example, it is possible for the equation 1 + 1 = 0 to hold in some
fields, and consequently a − b = b − a does not necessarily imply that
a = b. For the field C the assertion 1 + 1 ≠ 0 was derived from the explicit
description of C; for the fields Q and R, however, this assertion was derived from
further properties which do not have analogues in the conditions for a field.
There is a related concept which does use these properties. An ordered field
is a field F (with operations + and ·) together with a certain subset P of F
(the "positive" elements) with the following properties:


(8) For all a in F, one and only one of the following is true: 
(i) a= 0, 
(ii) a is in P, 
(iii) —a is in P. 
(9) If a and b are in P, then a + b is in P.
(10) If a and b are in P, then a · b is in P.


490 Epilogue 


We have already seen that the field C cannot be made into an ordered field.
The field F₂, with only two elements, likewise cannot be made into an ordered
field: in fact, condition (8), applied to 1 = −1, shows that 1 must be in P;
then (9) implies that 1 + 1 = 0 is in P, contradicting (8). On the other hand,
the field F₁, consisting of all numbers a + b√2 with a, b in Q, certainly can
be made into an ordered field: let P be the set of all a + b√2 which are
positive real numbers (in the ordinary sense). The field F₃ can also be made
into an ordered field; the description of P is left to you.
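The argument for the two-element field can also be confirmed by exhaustive search. This is a minimal sketch under stated assumptions: the helper positive_sets is hypothetical, the field is encoded as arithmetic mod 2, and condition (8) is read strictly, so that (ii) and (iii) are counted separately even when a = −a.

```python
from itertools import combinations

def positive_sets(elems, add, mul, neg, zero):
    """Return every subset P of elems satisfying conditions (8), (9) and (10)."""
    E = list(elems)
    found = []
    for size in range(len(E) + 1):
        for P in map(set, combinations(E, size)):
            # (8): exactly one of a = 0, a in P, -a in P
            trichotomy = all([a == zero, a in P, neg(a) in P].count(True) == 1 for a in E)
            # (9) and (10): P is closed under + and ·
            closed = all(add(a, b) in P and mul(a, b) in P for a in P for b in P)
            if trichotomy and closed:
                found.append(P)
    return found

candidates = positive_sets((0, 1),
                           lambda a, b: (a + b) % 2,
                           lambda a, b: (a * b) % 2,
                           lambda a: (-a) % 2,
                           0)
```

The search returns no candidate set P, in agreement with the argument given in the text.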

It is natural to introduce notation for an arbitrary ordered field which 
corresponds to that used for Q and R: we define 


a > b   if a − b is in P,
a < b   if b > a,
a ≤ b   if a < b or a = b,
a ≥ b   if a > b or a = b.


Using these definitions we can reproduce, for an arbitrary ordered field F, 
the definitions of Chapter 7: 


A set A of elements of F is bounded above if there is some x in F such that
x ≥ a for all a in A. Any such x is called an upper bound for A. An element
x of F is a least upper bound for A if x is an upper bound for A and
x ≤ y for every y in F which is an upper bound for A.


Finally, it is possible to state an analogue of property P13 for R; this leads to 
the last abstraction of this chapter: 


A complete ordered field is an ordered field in which every nonempty 
set which is bounded above has a least upper bound. 


The consideration of fields may seem to have taken us far from the goal of 
constructing the real numbers. However, we are now provided with an 
intelligible means of formulating this goal. There are two questions which 
will be answered in the remaining two chapters: 


1. Is there a complete ordered field? 
2. Is there only one complete ordered field? 


Our starting point for these considerations will be Q, assumed to be an 
ordered field, containing N and Z as certain subsets. At one crucial point it 
will be necessary to assume another fact about Q: 


Let x be an element of Q with x > 0. Then for any y in Q there is some n
in N such that nx > y.


This assumption, which asserts that the rational numbers have the Archi- 
median property of the reals, does not follow from the other properties of an 
ordered field (for the example that demonstrates this conclusively see [17]). 
The important point for us is that when Q is explicitly constructed, properties
P1–P12 appear as theorems, and so does this additional assumption;
if we really began from the beginning, no assumptions about Q would be
necessary.


PROBLEMS 


1. Let F be the set {0, 1, 2} and define operations + and · on F by the
following table. (The rule for constructing this table is as follows: add or
multiply in the usual way, and then subtract the highest possible multiple
of 3; thus 2 · 2 = 4 = 3 + 1, so 2 · 2 = 1.)

 + | 0  1  2        · | 0  1  2
 --+---------       --+---------
 0 | 0  1  2        0 | 0  0  0
 1 | 1  2  0        1 | 0  1  2
 2 | 2  0  1        2 | 0  2  1

Show that F is a field, and prove that it cannot be made into an ordered
field.

2. Suppose now that we try to construct a field F having elements 0, 1, 2, 3 
with operations + and - defined as in the previous example, by adding 
or multiplying in the usual way, and then subtracting the highest possible 
multiple of 4. Show that F will not be a field.
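The table-building rule in Problems 1 and 2 ("add or multiply, then subtract the highest possible multiple of n") is ordinary arithmetic mod n. As a rough sketch, with mod_tables and has_inverses as hypothetical helper names, one can tabulate the operations and test only the existence of multiplicative inverses, which is the point at which the two cases differ:

```python
def mod_tables(n):
    """Addition and multiplication tables on {0, ..., n-1}, reduced mod n."""
    elems = range(n)
    add = {(a, b): (a + b) % n for a in elems for b in elems}
    mul = {(a, b): (a * b) % n for a in elems for b in elems}
    return add, mul

def has_inverses(n):
    """True if every nonzero element has some b with a · b = 1 in the mod-n table."""
    _, mul = mod_tables(n)
    return all(any(mul[a, b] == 1 for b in range(n)) for a in range(1, n))

add3, mul3 = mod_tables(3)
add4, mul4 = mod_tables(4)
```

With these tables, mul3[2, 2] reproduces the worked example 2 · 2 = 1, while in the four-element table 2 · 2 = 0, so 2 has no inverse there. This check does not replace the full verification the problems ask for.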

3. Let F = {0, 1, α, β} and define operations + and · on F by the follow-
ing tables.


Show that F is a field. 


4. (a) Let F be a field in which 1 + 1 = 0. Show that a + a = 0 for all
a (this can also be written a = −a).
(b) Suppose that a + a = 0 for some a ≠ 0. Show that 1 + 1 = 0 (and
consequently b + b = 0 for all b).


5. (a) Show that in any field we have

(1 + · · · + 1) · (1 + · · · + 1) = 1 + · · · + 1
     m times            n times        mn times

for all natural numbers m and n.
(b) Suppose that in the field F we have

1 + · · · + 1 = 0
   m times

for some natural number m. Show that the smallest m with this
property must be a prime number (this prime number is called the
characteristic of F).

6. Let F be any field with only finitely many elements.

(a) Show that there must be distinct natural numbers m and n with

1 + · · · + 1 = 1 + · · · + 1.
   m times        n times

(b) Conclude that there is some natural number k with

1 + · · · + 1 = 0.
   k times


7. Let a, b, c, and d be elements of a field F with a · d − b · c ≠ 0. Show
that the equations

a · x + b · y = α,
c · x + d · y = β,

can be solved for x and y.

8. Let a be an element of a field F. A "square root" of a is an element b of
F with b² = b · b = a.

(a) How many square roots does 0 have?
(b) Suppose a ≠ 0. Show that a has two square roots, unless 1 + 1 = 0,
in which case a has only one.


9. (a) Consider an equation x² + b · x + c = 0, where b and c are elements
of a field F. Suppose that b² − 4 · c has a square root r in F. Show
that (−b + r)/2 is a solution of this equation.

(b) In the field F₂ of the text, both elements clearly have a square root.
On the other hand, it is easy to check that neither element satisfies
the equation x² + x + 1 = 0. Thus some detail in part (a) must
be incorrect. What is it?

10. Let F be a field and a an element of F which does not have a square root.
This problem shows how to construct a bigger field F′, containing F, in
which a does have a square root. (This construction has already been
carried through in a special case, namely, F = R and a = −1; this
special case should guide you through this example.)

Let F′ consist of all pairs (x, y) with x and y in F. If the operations on F
are + and ·, define operations ⊕ and ⊙ on F′ as follows:

(x, y) ⊕ (z, w) = (x + z, y + w),
(x, y) ⊙ (z, w) = (x · z + a · y · w, y · z + x · w).

(a) Prove that F′, with the operations ⊕ and ⊙, is a field.
(b) Prove that

(x, 0) ⊕ (y, 0) = (x + y, 0),
(x, 0) ⊙ (y, 0) = (x · y, 0),

so that we may agree to abbreviate (x, 0) by x.
(c) Find a square root of a = (a, 0) in F′.
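The two operations of Problem 10 are easy to experiment with. The sketch below takes F to be the rationals (Python's Fraction) and uses the special case a = −1 mentioned in the problem, so that ⊙ should agree with complex multiplication. The names make_ops, oplus and odot are assumptions introduced only for this illustration.

```python
from fractions import Fraction as Q

def make_ops(a):
    """Return the operations ⊕ and ⊙ on pairs, for a fixed element a of the base field."""
    def oplus(p, q):
        (x, y), (z, w) = p, q
        return (x + z, y + w)
    def odot(p, q):
        (x, y), (z, w) = p, q
        return (x * z + a * y * w, y * z + x * w)
    return oplus, odot

oplus, odot = make_ops(Q(-1))            # special case a = -1

p, q = (Q(1), Q(2)), (Q(3), Q(-1))
pair_product = odot(p, q)                # computed with the pair formula
complex_product = complex(1, 2) * complex(3, -1)

embedded = odot((Q(2), Q(0)), (Q(3), Q(0)))   # part (b): pairs of the form (x, 0)
```

Here pair_product and complex_product describe the same number, 5 + 5i, and the product of (2, 0) and (3, 0) is (6, 0), as part (b) predicts.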


11. Let F be the set of all four-tuples (w, x, y, z) of real numbers. Define
+ and · by

(s, t, u, v) + (w, x, y, z) = (s + w, t + x, u + y, v + z),
(s, t, u, v) · (w, x, y, z) = (sw − tx − uy − vz, sx + tw + uz − vy,
                              sy + uw + vx − tz, sz + vw + ty − ux).

(a) Show that F satisfies all conditions for a field, except (6). At times
the algebra will become quite ornate, but the existence of multi-
plicative inverses is the only point requiring any thought.

(b) It is customary to denote

(0, 1, 0, 0) by i,
(0, 0, 1, 0) by j,
(0, 0, 0, 1) by k.

Find all 9 products of pairs i, j, and k. The results will show in par-
ticular that condition (6) is definitely false. This "skew field" F is
known as the quaternions.
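The multiplication rule of Problem 11 can be transcribed directly. This is only a sketch, with qmul as a hypothetical function name and integer four-tuples standing in for real ones; it computes two of the nine products and leaves the rest to the reader.

```python
def qmul(p, q):
    """Product of two four-tuples, following the formula in Problem 11."""
    s, t, u, v = p
    w, x, y, z = q
    return (s*w - t*x - u*y - v*z,
            s*x + t*w + u*z - v*y,
            s*y + u*w + v*x - t*z,
            s*z + v*w + t*y - u*x)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

ij = qmul(i, j)
ji = qmul(j, i)
```

The two products differ in sign, which is exactly the failure of condition (6).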


CHAPTER 


CONSTRUCTION OF THE 
REAL NUMBERS 


The mass of drudgery which this chapter necessarily contains is relieved by 
one truly first-rate idea. In order to prove that a complete ordered field exists 
we will have to explicitly describe one in detail; verifying conditions (1)—(10) 
for an ordered field will be a straightforward ordeal, but the description of the 
field itself, of the elements in it, is ingenious indeed. 

At our disposal is the set of rational numbers, and from this raw material 
it is necessary to produce the field which will ultimately be called the real 
numbers. To the uninitiated this must seem utterly hopeless—if only the 
rational numbers are known, where are the others to come from? By now we 
have had enough experience to realize that the situation may not be quite so 
hopeless as that casual consideration suggests. The strategy to be adopted in 
our construction has already been used effectively for defining functions and 
complex numbers. Instead of trying to determine the "real nature" of these
concepts, we settled for a definition that described enough about them to 
determine their mathematical properties completely. 

A similar proposal for defining real numbers requires a description of real 
numbers in terms of rational numbers. The observation, that a real number 
ought to be determined completely by the set of rational numbers less than it, 
suggests a strikingly simple and quite attractive possibility: a real number 
might (and in fact eventually will) be described as a collection of rational 
numbers. In order to make this proposal effective, however, some means must 
be found for describing "the set of rational numbers less than a real number"
without mentioning real numbers, which are still nothing more than heuristic 
figments of our mathematical imagination. 

If A is to be regarded as the set of rational numbers which are less than the
real number α, then A ought to have the following property: If x is in A and y
is a rational number satisfying y < x, then y is in A. In addition to this
property, the set A should have a few others. Since there should be some
rational number x < α, the set A should not be empty. Likewise, since there
should be some rational number x > α, the set A should not be all of Q.
Finally, if x < α, then there should be another rational number y with
x < y < α, so A should not contain a greatest member.

If we temporarily regard the real numbers as known, then it is not hard to
check (Problem 8-17) that a set A with these properties is indeed the set of
rational numbers less than some real number α. Since the real numbers are
presently in limbo, your proof, if you supply one, must be regarded only as an
unofficial comment on these proceedings. It will serve to convince you, how-
ever, that we have not failed to notice any crucial property of the set A. There
appears to be no reason for hesitating any longer.


DEFINITION

A real number is a set α, of rational numbers, with the following four
properties:

(1) If x is in α and y is a rational number with y < x, then y is also in α.
(2) α ≠ ∅.
(3) α ≠ Q.
(4) There is no greatest element in α; in other words, if x is in α, then
there is some y in α with y > x.

The set of all real numbers is denoted by R.


Just to remind you of the philosophy behind our definition, here is an
explicit example of a real number:

α = {x in Q: x < 0 or x² < 2}.

It should be clear that α is the real number which will eventually be known
as √2, but it is not an entirely trivial exercise to show that α actually is a
real number. The whole point of such an exercise is to prove this using only
facts about Q; the hard part will be checking condition (4), but this has already
appeared as a problem in a previous chapter (finding out which one is up to
you). Notice that condition (4), although quite bothersome here, is really
essential in order to avoid ambiguity; without it both

{x in Q: x < 1}
and
{x in Q: x ≤ 1}

would be candidates for the "real number 1."
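A set of rationals such as α = {x in Q: x < 0 or x² < 2} can be modeled on a computer by its membership test. The sketch below does this with exact rationals. The function larger_member gives one possible witness for condition (4); the formula (2x + 2)/(x + 2) is an assumption of this illustration, not necessarily the route taken by the problem the text alludes to.

```python
from fractions import Fraction as Q

def alpha(x):
    """Membership test for the set {x in Q : x < 0 or x*x < 2}."""
    return x < 0 or x * x < 2

def larger_member(x):
    """Given x in alpha, return some y in alpha with y > x (condition (4))."""
    if x <= 0:
        return Q(1)                   # 1 is in alpha because 1*1 < 2
    # For x > 0 with x*x < 2: y - x = (2 - x*x)/(x + 2) > 0 and
    # y*y - 2 = 2*(x*x - 2)/(x + 2)**2 < 0, so y is a larger member.
    return (2 * x + 2) / (x + 2)

x = Q(7, 5)
y = larger_member(x)
```

Only rational arithmetic is used, in keeping with the requirement that the verification rely on facts about Q alone.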
The shift from A to α in our definition indicates both a conceptual and a
notational concern. Henceforth, a real number is, by definition, a set of
rational numbers. This means, in particular, that a rational number (a
member of Q) is not a real number; instead every rational number x has a
natural counterpart which is a real number, namely, {y in Q: y < x}. After
completing the construction of the real numbers, we can mentally throw away
the elements of Q and agree that Q will henceforth denote these special sets.
For the moment, however, it will be necessary to work at the same time with
rational numbers, real numbers (sets of rational numbers) and even sets of
real numbers (sets of sets of rational numbers). Some confusion is perhaps
inevitable, but proper notation should keep this to a minimum. Rational
numbers will be denoted by lower case Roman letters (x, y, z, a, b, c) and
real numbers by lower case Greek letters (α, β, γ); capital Roman letters
(A, B, C) will be used to denote sets of real numbers.

The remainder of this chapter is devoted to the definition of +, ·, and P
for R, and a proof that with these structures R is indeed a complete ordered
field.




We shall actually begin with the definition of P, and even here we shall
work backwards. We first define α < β; later, when +, ·, and 0 are available,
we shall define P as the set of all α with 0 < α, and prove the necessary
properties for P. The reason for beginning with the definition of < is the
simplicity of this concept in our present setup:

Definition. If α and β are real numbers, then α < β means that α is contained
in β (that is, every element of α is also an element of β), but α ≠ β.

A repetition of the definitions of ≤, >, ≥ would be stultifying, but it is
interesting to note that ≤ can now be expressed more simply than <; if α and
β are real numbers, then α ≤ β if and only if α is contained in β.

If A is a bounded collection of real numbers, it is almost obvious that A
should have a least upper bound. Each α in A is a collection of rational num-
bers; if these rational numbers are all put in one collection β, then β is pre-
sumably sup A. In the proof of the following theorem we check all the little
details which have not been mentioned, not least of which is the assertion that
β is a real number. (We will not bother numbering theorems in this chapter,
since they all add up to one big Theorem: There is a complete ordered field.)


THEOREM If A is a set of real numbers and A ≠ ∅ and A is bounded above, then A has a
least upper bound.


PROOF Let β = {x: x is in some α in A}. Then β is certainly a collection of rational
numbers; the proof that β is a real number requires checking four facts.

(1) Suppose that x is in β and y < x. The first condition means that x is
in α for some α in A. Since α is a real number, the assumption y < x
implies that y is in α. Therefore it is certainly true that y is in β.

(2) Since A ≠ ∅, there is some α in A. Since α is a real number, there is
some x in α. This means that x is in β, so β ≠ ∅.

(3) Since A is bounded above, there is some real number γ such that
α ≤ γ for every α in A. Since γ is a real number, there is some
rational number x which is not in γ. Now α ≤ γ means that α is
contained in γ, so it is also true that x is not in α for any α in A. This
means that x is not in β; so β ≠ Q.

(4) Suppose that x is in β. Then x is in α for some α in A. Since α does
not have a greatest member, there is some rational number y with
x < y and y in α. But this means that y is in β; thus β does not have
a greatest member.

These four observations prove that β is a real number. The proof that β is
the least upper bound of A is easier. If α is in A, then clearly α is con-
tained in β; this means that α ≤ β, so β is an upper bound for A. On the other
hand, if γ is an upper bound for A, then α ≤ γ for every α in A; this means
that α is contained in γ, for every α in A, and this surely implies that β is con-
tained in γ. This, in turn, means that β ≤ γ; thus β is the least upper bound
of A. ∎
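The idea that sup A is the union of the sets in A can be illustrated with membership tests. The sketch below is restricted, as an assumption of the illustration, to a finite family of cuts of the form {x in Q: x < r} with r rational; the theorem itself covers arbitrary nonempty bounded families. The names cut_below and sup are hypothetical.

```python
from fractions import Fraction as Q

def cut_below(r):
    """Membership test for the set {x in Q : x < r}, r rational."""
    return lambda x: x < r

def sup(cuts):
    """Membership test for the union of a nonempty finite family of cuts."""
    cuts = list(cuts)
    return lambda x: any(alpha(x) for alpha in cuts)

# The family A = {1/2, 2/3, 3/4, 4/5, 5/6}, each member viewed as a cut.
A = [cut_below(Q(n, n + 1)) for n in range(1, 6)]
beta = sup(A)
```

A rational x lies in beta exactly when it lies in some member of A, so here beta has the same members as the cut of 5/6.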


The definition of + is both obvious and easy, but it must be complemented
with a proof that this "obvious" definition makes any sense at all.

Definition. If α and β are real numbers, then

α + β = {x: x = y + z for some y in α and some z in β}.

THEOREM If α and β are real numbers, then α + β is a real number.


PROOF Once again four facts must be verified.

(1) Suppose w < x for some x in α + β. Then x = y + z for some y in α
and some z in β, which means that w < y + z, and consequently,
w − y < z. This shows that w − y is in β (since z is in β, and β is a
real number). Since w = y + (w − y), it follows that w is in α + β.

(2) It is clear that α + β ≠ ∅, since α ≠ ∅ and β ≠ ∅.

(3) Since α ≠ Q and β ≠ Q, there are rational numbers a and b with a
not in α and b not in β. Any x in α satisfies x < a (for if a < x, then
condition (1) for a real number would imply that a is in α); similarly
any y in β satisfies y < b. Thus x + y < a + b for any x in α and y
in β. This shows that a + b is not in α + β, so α + β ≠ Q.

(4) If x is in α + β, then x = y + z for y in α and z in β. There are
y′ in α and z′ in β with y < y′ and z < z′; then x < y′ + z′ and
y′ + z′ is in α + β. Thus α + β has no greatest member. ∎
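For cuts determined by rational numbers the definition of + can be checked concretely. The following sketch assumes α = {x < a} and β = {x < b} with a and b rational; the helper split is hypothetical. Given x < a + b it produces an explicit decomposition x = y + z with y in α and z in β, and it reports failure when x ≥ a + b, which matches step (3) of the proof.

```python
from fractions import Fraction as Q

def split(x, a, b):
    """For x < a + b return (y, z) with y < a, z < b and y + z == x; otherwise None."""
    eps = (a + b) - x
    if eps <= 0:
        return None                      # x is not in the sum of the two cuts
    return a - eps / 2, b - eps / 2      # share the slack equally

a, b = Q(1, 2), Q(2, 3)
y, z = split(Q(1), a, b)
```

For these rational cuts the sum is the set {x in Q: x < a + b}.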


By now you can see how tiresome this whole procedure is going to be. Every
time we mention a new real number, we must prove that it is a real number;
this requires checking four conditions, and even when trivial they require
concentration. There is really no help for this (except that it will be less
boring if you check the four conditions for yourself). Fortunately, however, a
few points of interest will arise now and then, and some of our theorems will be
easy. In particular, two properties of + present no problems.


THEOREM If α, β, and γ are real numbers, then (α + β) + γ = α + (β + γ).

PROOF Since (x + y) + z = x + (y + z) for all rational numbers x, y, and z, every
member of (α + β) + γ is also a member of α + (β + γ), and vice versa. ∎

THEOREM If α and β are real numbers, then α + β = β + α.

PROOF Left to you (even easier). ∎




To prove the other properties of + we first define 0.

Definition. 0 = {x in Q: x < 0}.

It is, thank goodness, obvious that 0 is a real number, and the following
theorem is also simple.

THEOREM If α is a real number, then α + 0 = α.

PROOF If x is in α and y is in 0, then y < 0, so x + y < x. This implies that x + y is
in α. Thus every member of α + 0 is also a member of α.

On the other hand, if x is in α, then there is a rational number y in α such
that y > x. Since x = y + (x − y), where y is in α, and x − y < 0 (so that
x − y is in 0), this shows that x is in α + 0. Thus every member of α is also a
member of α + 0. ∎


The reasonable candidate for −α would seem to be the set

{x in Q: −x is not in α}

(since −x not in α means, intuitively, that −x ≥ α, so that x ≤ −α). But in
certain cases this set will not even be a real number. Although a real number
α does not have a greatest member, the set

Q − α = {x in Q: x is not in α}

may have a least element x₀; when α is a real number of this kind, the set
{x: −x is not in α} will have a greatest element −x₀. It is therefore necessary
to introduce a slight modification into the definition of −α, which comes
equipped with a theorem.
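The failure of the naive candidate is easy to see on a cut determined by a rational number. The sketch below assumes α = {x in Q: x < r} with r = 3/4, so that Q − α has the least element r; the names naive_neg and neg are hypothetical.

```python
from fractions import Fraction as Q

r = Q(3, 4)
alpha = lambda x: x < r                 # the cut {x in Q : x < 3/4}

# Naive candidate: {x : -x is not in alpha}, which equals {x : x <= -r}.
naive_neg = lambda x: not alpha(-x)

# Modified candidate: additionally exclude the x for which -x is the
# least element of Q - alpha (that least element is r for this cut).
neg = lambda x: (not alpha(-x)) and (-x != r)

step = Q(1, 100)
```

The naive set contains −r but nothing larger, so it has a greatest element and violates condition (4); the modified set drops −r and keeps everything below it.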


Definition. If α is a real number, then

−α = {x in Q: −x is not in α, but −x is not the least element of Q − α}.

THEOREM If α is a real number, then −α is a real number.


PROOF

(1) Suppose that x is in −α and y < x. Then −y > −x. Since −x is not
in α, it is also true that −y is not in α. Moreover, it is clear that −y is
not the smallest element of Q − α, since −x is a smaller element.
This shows that y is in −α.

(2) Since α ≠ Q, there is some rational number y which is not in α. We
can assume that y is not the smallest rational number in Q − α
(since y can always be replaced by any y′ > y). Then −y is in −α.
Thus −α ≠ ∅.

(3) Since α ≠ ∅, there is some x in α. Then −x cannot possibly be in −α,
so −α ≠ Q.

(4) If x is in −α, then −x is not in α, and there is a rational number
y < −x which is also not in α. Let z be a rational number with
y < z < −x. Then z is also not in α, and z is clearly not the smallest
element of Q − α. So −z is in −α. Since −z > x, this shows that
−α does not have a greatest element. ∎


The proof that α + (−α) = 0 is not entirely straightforward. The diffi-
culties are not caused, as you might presume, by the finicky details in the
definition of −α. Rather, at this point we require the Archimedian property of
Q stated on page 490, which does not follow from P1–P12. This property
is needed to prove the following lemma, which plays a crucial role in the next
theorem.

LEMMA Let α be a real number, and z a positive rational number. Then there are
(Figure 1) rational numbers x in α, and y not in α, such that y − x = z. More-
over, we may assume that y is not the smallest element of Q − α.


PROOF Suppose first that z is in α. If the numbers

z, 2z, 3z, . . .

were all in α, then every rational number would be in α, since every rational
number w satisfies w < nz for some n, by the additional assumption on page
490. This contradicts the fact that α is a real number, so there is some k such
that x = kz is in α and y = (k + 1)z is not in α. Clearly y − x = z.

Moreover, if y happens to be the smallest element of Q − α, let x′ > x be
an element of α, and replace x by x′, and y by y + (x′ − x).

If z is not in α, there is a similar proof, based on the fact that the numbers
(−n)z cannot all fail to be in α. ∎

FIGURE 1
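The search in the proof of the lemma can be carried out mechanically when α is given by a membership test. The sketch below is an illustration under assumptions: straddle is a hypothetical name, the cut used is the set {x: x < 0 or x² < 2} from earlier in the chapter, and the final adjustment (arranging that y is not the least element of Q − α) is omitted. Both loops terminate because of the Archimedian assumption on Q.

```python
from fractions import Fraction as Q

def straddle(alpha, z):
    """Return (x, y) = (k*z, (k+1)*z) with x in alpha and y not in alpha, for z > 0."""
    k = 0
    while alpha((k + 1) * z):     # climb while the next multiple is still in alpha
        k += 1
    while not alpha(k * z):       # descend if k*z itself is not in alpha
        k -= 1
    return k * z, (k + 1) * z

sqrt2_cut = lambda x: x < 0 or x * x < 2
x, y = straddle(sqrt2_cut, Q(1, 10))

low_cut = lambda t: t < Q(-5, 2)  # a cut that does not contain z = 1
x2, y2 = straddle(low_cut, Q(1))
```

The first call covers the case where z is in α, the second the case where it is not.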


THEOREM If α is a real number, then

α + (−α) = 0.

PROOF Suppose x is in α and y is in −α. Then −y is not in α, so −y > x. Hence
x + y < 0, so x + y is in 0. Thus every member of α + (−α) is in 0.

It is a little more difficult to go in the other direction. If z is in 0, then
−z > 0. According to the lemma, there is some x in α, and some y not in α,
with y not the smallest element of Q − α, such that y − x = −z. This equa-
tion can be written x + (−y) = z. Since x is in α, and −y is in −α, this
proves that z is in α + (−α). ∎




Before proceeding with multiplication, we define the "positive elements"
and prove a basic property:

Definition. P = {α in R: α > 0}.

Notice that α + β is clearly in P if α and β are.

THEOREM If α is a real number, then one and only one of the following conditions holds:

(i) α = 0,
(ii) α is in P,
(iii) −α is in P.


PROOF If α contains any positive rational number, then α certainly contains all nega-
tive rational numbers, so α contains 0 and α ≠ 0, i.e., α is in P. If α contains
no positive rational numbers, then one of two possibilities must hold:

(1) α contains all negative rational numbers; then α = 0.

(2) there is some negative rational number x which is not in α; it can be
assumed that x is not the least element of Q − α (since x could be
replaced by x/2 > x); then −α contains the positive rational number
−x, so, as we have just proved, −α is in P.

This shows that at least one of (i)–(iii) must hold. If α = 0, it is clearly impossi-
ble for condition (ii) or (iii) to hold. Moreover, it is impossible that α > 0 and
−α > 0 both hold, since this would imply that 0 = α + (−α) > 0. ∎


Recall that α > β was defined to mean that α contains β, but is unequal to
β. This definition was fine for proving completeness, but now we have to show
that it is equivalent to the definition which would be made in terms of P.
Thus, we must show that α − β > 0 is equivalent to α > β. This is clearly a
consequence of the next theorem.

THEOREM If α, β, and γ are real numbers and α > β, then α + γ > β + γ.

PROOF The hypothesis α > β implies that β is contained in α; it follows immediately
from the definition of + that β + γ is contained in α + γ. This shows that
α + γ ≥ β + γ. We can easily rule out the possibility of equality, for if

α + γ = β + γ,

then

α = (α + γ) + (−γ) = (β + γ) + (−γ) = β,

which is false. Thus α + γ > β + γ. ∎


Multiplication presents difficulties of its own. If α, β > 0, then α · β can
be defined as follows.

Definition. If α and β are real numbers and α, β > 0, then

α · β = {z: z ≤ 0 or z = x · y for some x in α and y in β with x, y > 0}.

THEOREM If α and β are real numbers with α, β > 0, then α · β is a real number.

PROOF As usual, we must check four conditions.

(1) Suppose w < z, where z is in α · β. If w ≤ 0, then w is automatically
in α · β. Suppose that w > 0. Then z > 0, so z = x · y for some posi-
tive x in α and positive y in β. Now

w = wz/z = wxy/z = ((w/z) · x) · y.

Since 0 < w < z, we have w/z < 1, so (w/z) · x is in α. Thus w is
in α · β.

(2) Clearly α · β ≠ ∅.

(3) If x is not in α, and y is not in β, then x > x′ for all x′ in α, and
y > y′ for all y′ in β. Hence xy > x′y′ for all such positive x′ and y′.
So xy is not in α · β; thus α · β ≠ Q.

(4) Suppose w is in α · β, and w ≤ 0. There is some x in α with x > 0
and some y in β with y > 0. Then z = xy is in α · β and z > w.
Now suppose w > 0. Then w = xy for some positive x in α and some
positive y in β. Moreover, α contains some x′ > x; if z = x′y, then
z > xy = w, and z is in α · β. Thus α · β does not have a greatest
element. ∎


Notice that α · β is clearly in P if α and β are. This completes the verifica-
tion of all properties of P. To complete the definition of · we first define |α|.

Definition. If α is a real number, then

|α| = α, if α ≥ 0,
|α| = −α, if α ≤ 0.

Definition. If α and β are real numbers, then

α · β = 0, if α = 0 or β = 0,
α · β = |α| · |β|, if α > 0, β > 0 or α < 0, β < 0,
α · β = −(|α| · |β|), if α > 0, β < 0 or α < 0, β > 0.
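The pattern of this definition, extending a product defined only for positive numbers by reduction to absolute values, can be written out as a small function. In the sketch below ordinary rationals stand in for real numbers (an assumption made purely for illustration), and extend_by_sign is a hypothetical name.

```python
from fractions import Fraction as Q

def extend_by_sign(mul_pos):
    """Extend a product defined for positive arguments to all arguments."""
    def mul(a, b):
        if a == 0 or b == 0:
            return Q(0)
        p = mul_pos(abs(a), abs(b))          # |a| · |b|, both factors positive
        return p if (a > 0) == (b > 0) else -p
    return mul

# Stand-in for the product of positive numbers: ordinary multiplication.
mul = extend_by_sign(lambda x, y: x * y)
```

The three branches correspond line by line to the three cases of the definition.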


As one might suspect, the proofs of the properties of multiplication usually
involve reduction to the case of positive numbers.

THEOREM If α, β, and γ are real numbers, then α · (β · γ) = (α · β) · γ.




PROOF This is clear if α, β, γ > 0. The proof for the general case requires considering
separate cases (and is simplified slightly if one uses the following theorem). ∎

THEOREM If α and β are real numbers, then α · β = β · α.

PROOF This is clear if α, β > 0, and the other cases are easily checked. ∎

Definition. 1 = {x in Q: x < 1}.
(It is clear that 1 is a real number.)

THEOREM If α is a real number, then α · 1 = α.

PROOF Let α > 0. It is easy to see that every member of α · 1 is also a member of α.
On the other hand, suppose x is in α. If x ≤ 0, then x is automatically in
α · 1. If x > 0, then there is some rational number y in α such that x < y.
Then x = y · (x/y), and x/y is in 1, so x is in α · 1. This proves that α · 1 = α
if α > 0.

If α < 0, then, applying the result just proved, we have

α · 1 = −(|α| · |1|) = −(|α|) = α.

Finally, the theorem is obvious when α = 0. ∎


Definition. If α is a real number and α > 0, then

α⁻¹ = {x in Q: x ≤ 0, or x > 0 and 1/x is not in α, but 1/x is not the smallest
member of Q − α};

if α < 0, then α⁻¹ = −(|α|⁻¹).

THEOREM If α is a real number unequal to 0, then α⁻¹ is a real number.

PROOF Clearly it suffices to consider only α > 0. Four conditions must be checked.

(1) Suppose y < x, and x is in α⁻¹. If y ≤ 0, then y is in α⁻¹. If y > 0,
then x > 0, so 1/x is not in α. Since 1/y > 1/x, it follows that 1/y is
not in α, and 1/y is clearly not the smallest element of Q − α, so y is
in α⁻¹.

(2) Clearly α⁻¹ ≠ ∅.

(3) Since α > 0, there is some positive rational number x in α. Then 1/x
is not in α⁻¹, so α⁻¹ ≠ Q.

(4) Suppose x is in α⁻¹. If x ≤ 0, there is clearly some y in α⁻¹ with
y > x because α⁻¹ contains some positive rationals. If x > 0, then
1/x is not in α. Since 1/x is not the smallest member of Q − α, there
is a rational number y not in α, with y < 1/x. Choose a rational
number z with y < z < 1/x. Then 1/z is in α⁻¹, and 1/z > x. Thus
α⁻¹ does not contain a largest member. ∎




In order to prove that α⁻¹ is really the multiplicative inverse of α, it helps
to have another lemma, which is the multiplicative analogue of our first
lemma.

LEMMA Let α be a real number with α > 0, and z a rational number with z > 1.
Then there are rational numbers x in α, and y not in α, such that y/x = z.
Moreover, we can assume that y is not the least element of Q − α.

PROOF Suppose first that z is in α. Since z − 1 > 0 and

zⁿ = (1 + (z − 1))ⁿ ≥ 1 + n(z − 1),

it follows that the numbers

z, z², z³, . . .

cannot all be in α. So there is some k such that x = zᵏ is in α, and y = zᵏ⁺¹
is not in α. Clearly y/x = z. Moreover, if y happens to be the least element of
Q − α, let x′ > x be an element of α, and replace x by x′ and y by yx′/x.

If z is not in α, there is a similar proof, based on the fact that the numbers
1/zᵏ cannot all fail to be in α. ∎
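The first case of this proof can be followed numerically. The sketch below makes several assumptions for illustration: the cut is the earlier example {x: x < 0 or x² < 2}, z is taken to be 11/10 (which lies in that cut), the helper name straddle_ratio is hypothetical, and the adjustment for y being the least element of Q − α is left out. The inequality zⁿ ≥ 1 + n(z − 1) is also spot-checked for small n.

```python
from fractions import Fraction as Q

def straddle_ratio(alpha, z):
    """For z > 1 with z in alpha, return (z**k, z**(k+1)) with the first in alpha, the second not."""
    x = z
    while alpha(x * z):       # keep multiplying by z while the next power is in alpha
        x *= z
    return x, x * z

sqrt2_cut = lambda t: t < 0 or t * t < 2
z = Q(11, 10)
x, y = straddle_ratio(sqrt2_cut, z)

# Spot check of the inequality used in the proof.
bernoulli_ok = all(z**n >= 1 + n * (z - 1) for n in range(1, 20))
```

The loop stops because the powers of z eventually exceed any bound, which is what the displayed inequality guarantees.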


THEOREM If α is a real number and α ≠ 0, then α · α⁻¹ = 1.

PROOF It obviously suffices to consider only α > 0, in which case α⁻¹ > 0. Suppose
that x is a positive rational number in α, and y is a positive rational number
in α⁻¹. Then 1/y is not in α, so 1/y > x; consequently xy < 1, which means
that xy is in 1. Since all rational numbers x ≤ 0 are also in 1, this shows that
every member of α · α⁻¹ is in 1.

To prove the converse assertion, let z be in 1. If z ≤ 0, then clearly z is in
α · α⁻¹. Suppose 0 < z < 1. According to the lemma, there are positive
rational numbers x in α, and y not in α, such that y/x = 1/z; and we can
assume that y is not the smallest element of Q − α. But this means that
z = x · (1/y), where x is in α, and 1/y is in α⁻¹. Consequently, z is in α · α⁻¹. ∎


We are almost done! Only the proof of the distributive law remains. Once 
again we must consider many cases, but do not despair. The case when all 
numbers are positive contains an interesting point, and the other cases can 
all be taken care of very neatly. 


THEOREM If α, β, and γ are real numbers, then α · (β + γ) = α · β + α · γ.

PROOF Assume first that α, β, γ > 0. Then both numbers in the equation contain
all rational numbers ≤ 0. A positive rational number in α · (β + γ) is of
the form x · (y + z) for positive x in α, y in β, and z in γ. Since x · (y + z) =
x · y + x · z, where x · y is a positive element of α · β, and x · z is a positive
element of α · γ, this number is also in α · β + α · γ. Thus, every element of
α · (β + γ) is also in α · β + α · γ.

On the other hand, a positive rational number in α · β + α · γ is of the
form x₁ · y + x₂ · z for positive x₁, x₂ in α, y in β, and z in γ. If x₁ ≤ x₂, then
(x₁/x₂) · y ≤ y, so (x₁/x₂) · y is in β. Thus

x₁ · y + x₂ · z = x₂[(x₁/x₂)y + z]

is in α · (β + γ). Of course, the same trick works if x₂ ≤ x₁.

To complete the proof it is necessary to consider the cases when α, β, and γ
are not all > 0. If any one of the three equals 0, the proof is easy and the cases
involving α < 0 can be derived immediately once all the possibilities for β
and γ have been accounted for. Thus we assume α > 0 and consider three
cases: β, γ < 0, and β < 0, γ > 0, and β > 0, γ < 0. The first follows
immediately from the case already proved, and the third follows from the
second by interchanging β and γ. Therefore we concentrate on the case
β < 0, γ > 0. There are then two possibilities:

(1) β + γ ≥ 0. Then

α · γ = α · ([β + γ] + |β|) = α · (β + γ) + α · |β|,

so

α · (β + γ) = −(α · |β|) + α · γ
            = α · β + α · γ.

(2) β + γ ≤ 0. Then

α · |β| = α · (|β + γ| + γ) = α · |β + γ| + α · γ,

so

α · (β + γ) = −(α · |β + γ|) = −(α · |β|) + α · γ = α · β + α · γ. ∎
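The regrouping trick in the positive case, where x₁ · y + x₂ · z is rewritten with the larger of x₁ and x₂ factored out, can be checked with exact rational arithmetic. The function name regroup and the sample values are assumptions of this sketch.

```python
from fractions import Fraction as Q

def regroup(x1, y, x2, z):
    """Rewrite x1*y + x2*z as x*(y2 + z2) with x = max(x1, x2), y2 <= y, z2 <= z (all positive)."""
    if x1 <= x2:
        return x2, (x1 / x2) * y, z
    return x1, y, (x2 / x1) * z

x1, y, x2, z = Q(1, 2), Q(3), Q(2), Q(5)
x, y2, z2 = regroup(x1, y, x2, z)
```

Because y2 ≤ y and z2 ≤ z, the new summands remain in β and γ respectively, which is why the regrouped product lies in α · (β + γ).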


This proof completes the work of the chapter. Although long and frequently 
tedious, this chapter contains results sufficiently important to be read in detail 
at least once (and preferably not more than once!). For the first time we know 
that we have not been operating in a vacuum—there is indeed a complete 
ordered field, the theorems of this book are not based on assumptions which 
can never be realized. One interesting and horrid possibility remains: there 
may be several complete ordered fields. If this is true, then the theorems of 
calculus are unexpectedly rich in content, but the properties P1–P13 are
disappointingly incomplete. The last chapter disposes of this possibility:
properties P1–P13 completely characterize the real numbers, and anything that
can be proved about real numbers can be proved on the basis of these
properties alone.


PROBLEMS 


There are only two problems in this set, but each asks for an entirely different 
construction of the real numbers! The detailed examination of another con- 
struction is recommended only for masochists, but the main idea behind these 
other constructions is worth knowing. The real numbers constructed in this
chapter might be called “the algebraist’s real numbers,” since they were
purposely defined so as to guarantee the least upper bound property, which
involves the ordering <, an algebraic notion. The real number system
constructed in the next problem might be called “the analyst’s real numbers,”
since they are devised so that Cauchy sequences will always converge.


1. Since every real number ought to be the limit of some Cauchy sequence
of rational numbers, we might try to define a real number to be a Cauchy
sequence of rational numbers. Since two Cauchy sequences might converge
to the same real number, however, this proposal requires some
modifications.

(a) Define two Cauchy sequences of rational numbers {a_n} and {b_n} to be
equivalent (denoted by {a_n} ~ {b_n}) if lim (a_n − b_n) = 0. Prove that
{a_n} ~ {a_n}, that {a_n} ~ {b_n} if {b_n} ~ {a_n}, and that {a_n} ~ {c_n}
if {a_n} ~ {b_n} and {b_n} ~ {c_n}.

(b) Suppose that α is the set of all sequences equivalent to {a_n}, and β is
the set of all sequences equivalent to {b_n}. Prove that either α ∩ β = ∅
or α = β. (If α ∩ β ≠ ∅, then there is some {c_n} in both α and β. Show
that in this case α and β both consist precisely of those sequences
equivalent to {c_n}.)


Part (b) shows that the collection of all Cauchy sequences can be split 
up into disjoint sets, each set consisting of all sequences equivalent to 
some fixed sequence. We define a real number to be such a collection, 
and denote the set of all real numbers by R. 

(c) If α and β are real numbers, let {a_n} be a sequence in α, and {b_n} a
sequence in β. Define α + β to be the collection of all sequences
equivalent to the sequence {a_n + b_n}. Show that {a_n + b_n} is a
Cauchy sequence and also show that this definition does not depend
on the particular sequences {a_n} and {b_n} chosen for α and β. Check
also that the analogous definition of multiplication is well defined.

(d) Show that R is a field with these operations; existence of a multiplicative
inverse is the only interesting point to check.

(e) Define the positive real numbers P so that R will be an ordered field.

(f) Prove that every Cauchy sequence of real numbers converges.
Remember that if {α_n} is a sequence of real numbers, then each α_n is
itself a collection of Cauchy sequences of rational numbers.
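
The notion of equivalence in part (a) can be made concrete with a small computation. The following Python sketch is only an illustration and not part of the problem: the function names `newton_sqrt2` and `decimal_sqrt2` are invented here, and a finite check of this kind proves nothing about limits. It builds two different sequences of rational numbers that both approach the square root of 2 and lists the first few terms of their difference.

```python
from fractions import Fraction
from math import isqrt

def newton_sqrt2(n):
    """n-th term of the iteration x -> (x + 2/x)/2, starting from x = 1."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def decimal_sqrt2(n):
    """The square root of 2 truncated to n decimal places, as an exact rational."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

# Terms of {|a_n - b_n|}; the two sequences are equivalent when this tends to 0.
diffs = [abs(newton_sqrt2(n) - decimal_sqrt2(n)) for n in range(1, 8)]
```

Both sequences consist of rational numbers only, yet neither has a rational limit; this is exactly why the real number is taken to be the whole collection of equivalent sequences rather than any single one.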


2. This problem outlines a construction of “the high-school student’s real
numbers.” We define a real number to be a pair (a, {b_n}), where a is an
integer and {b_n} is a sequence of natural numbers from 0 to 9, with the
proviso that the sequence is not eventually 9; intuitively, this pair represents
a + Σ_{n=1}^∞ b_n 10^{-n}. With this definition, a real number is a very concrete
object, but the difficulties involved in defining addition and multiplication
are formidable (how do you add infinite decimals without
worrying about carrying digits infinitely far out?). A reasonable approach
is outlined below; the trick is to use least upper bounds right from the
start.


(a) Define (a, {b_n}) < (c, {d_n}) if a < c, or if a = c and for some n we
have b_n < d_n but b_j = d_j for 1 ≤ j < n. Using this definition, prove
the least upper bound property.

(b) Given α = (a, {b_n}), define α_k = a + Σ_{n=1}^k b_n 10^{-n}; intuitively, α_k is
the rational number obtained by changing all decimal places after
the kth to 0. Conversely, given a rational number r of the form
a + Σ_{n=1}^k b_n 10^{-n}, let r′ denote the real number (a, {b_n′}), where
b_n′ = b_n for 1 ≤ n ≤ k and b_n′ = 0 for n > k. Now for α = (a, {b_n})
and β = (c, {d_n}) define

α + β = sup {(α_k + β_k)′ : k a natural number}

(the least upper bound exists by part (a)). If multiplication is defined
similarly, then the verification of all conditions for a field is a straightforward
task, not highly recommended. Once more, however,
existence of multiplicative inverses will be the hardest.
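
For readers who like to experiment, the ordering of part (a) and the truncations α_k of part (b) can be sketched in Python. Everything here is an assumption made for illustration: a "real number" is modelled as a pair `(a, b)` where `b` is a function giving the n-th digit, and the comparison inspects only finitely many places (the hypothetical parameter `digits`), so it is a stand-in for the true definition rather than an implementation of it.

```python
from fractions import Fraction

def less(x, y, digits=50):
    """Lexicographic order on pairs (a, b); only `digits` places are compared."""
    (a, b), (c, d) = x, y
    if a != c:
        return a < c
    for n in range(1, digits + 1):
        if b(n) != d(n):
            return b(n) < d(n)
    return False

def truncate(x, k):
    """The rational number a + sum of b_n 10^(-n) for n = 1, ..., k."""
    a, b = x
    return a + sum(Fraction(b(n), 10 ** n) for n in range(1, k + 1))

one_third = (0, lambda n: 3)                    # 0.3333...
one_half = (0, lambda n: 5 if n == 1 else 0)    # 0.5000...
```

The sums `truncate(one_third, k) + truncate(one_half, k)` are ordinary rational numbers, so adding them involves no infinite carrying; the sum of the two "real numbers" is then recovered as a least upper bound, which is the trick the problem describes.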


CHAPTER 29 UNIQUENESS OF THE REAL NUMBERS 


We shall now revert to the usual notation for real numbers, reserving boldface 
symbols for other fields which may turn up. Moreover, we will regard integers 
and rational numbers as special kinds of real numbers, and forget about the 
specific way in which real numbers were defined. In this chapter we are 
interested in only one question: are there any complete ordered fields other 
than R? The answer to this question, if taken literally, is “yes.” For example,
the field F_3 introduced in Chapter 25 is a complete ordered field, and it is
certainly not R. This field is a “silly” example because the pair (a, a) can be
regarded as just another name for the real number a; the operations

(a, a) + (b, b) = (a + b, a + b),
(a, a) · (b, b) = (a · b, a · b),


are consistent with this renaming. This sort of example shows that any 
intelligent consideration of the question requires some mathematical means 
of discussing such renaming procedures. 

If the elements of a field F are going to be used to rename elements of R,
then for each a in R there should correspond a “name” f(a) in F. The notation
f(a) suggests that renaming can be formulated in terms of functions. In order
to do this we will need a concept of function much more general than any
which has occurred until now; in fact, we will require the most general notion
of “function” used in mathematics. A function, in this general sense, is simply
a rule which assigns to some things, other things. To be formal, a function is a
collection of ordered pairs (of objects of any sort) which does not contain two
distinct pairs with the same first element. The domain of a function f is the
set A of all objects a such that (a, b) is in f for some b; this (unique) b is denoted
by f(a). If f(a) is in the set B for all a in A, then f is called a function from A
to B. For example,


if f(x) = sin x for all x in R (and f is defined only for x in R), then f is a
function from R to R; it is also a function from R to [−1, 1];

if f(z) = sin z for all z in C, then f is a function from C to C;

if f(z) = e^z for all z in C, then f is a function from C to C; it is also a
function from C to {z in C: z ≠ 0};

θ is a function from {z in C: z ≠ 0} to {x in R: 0 ≤ x < 2π};

if f is the collection of all pairs (a, (a, a)) for a in R, then f is a function
from R to F_3.
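
The formal definition of a function as a collection of ordered pairs can be tested directly on finite collections. The Python sketch below is an illustration only; the names `is_function`, `domain`, and `is_function_into` are invented for it, and the finite set `f` is merely a small analogue of the renaming a ↦ (a, a), not the function on all of R.

```python
def is_function(pairs):
    """True if no two distinct pairs have the same first element."""
    firsts = [a for a, _ in set(pairs)]
    return len(firsts) == len(set(firsts))

def domain(pairs):
    """The set of all first elements of the pairs."""
    return {a for a, _ in pairs}

def is_function_into(pairs, B):
    """True if `pairs` is a function all of whose values lie in B."""
    return is_function(pairs) and all(b in B for _, b in pairs)

f = {(a, (a, a)) for a in range(5)}   # finite analogue of a -> (a, a)
g = {(1, 2), (1, 3)}                  # not a function: 1 would have two values
```

Note that the same collection `f` is a function into many different sets B, just as sin is a function from R to R and also from R to [−1, 1].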




Suppose that F_1 and F_2 are two fields; we will denote the operations in F_1
by ⊕, ⊙, etc. and the operations in F_2 by +, ·, etc. If F_2 is going to be
considered as a collection of new names for elements of F_1, then there should be a
function from F_1 to F_2 with the following properties:

(1) The function f should be one-one, that is, if x ≠ y, then we should
have f(x) ≠ f(y); this means that no two elements of F_1 have the
same name.

(2) The function f should be “onto,” that is, for every element z in F_2
there should be some x in F_1 such that z = f(x); this means that every
element of F_2 is used to name some element of F_1.

(3) For all x and y in F_1 we should have

f(x ⊕ y) = f(x) + f(y),
f(x ⊙ y) = f(x) · f(y);

this means that the renaming procedure is consistent with the operations
of the field.

If we are also considering F_1 and F_2 as ordered fields, we add one more
requirement:

(4) If x ⧀ y, then f(x) < f(y).

A function with these properties is called an isomorphism from F_1 to F_2.
This definition is so important that we restate it formally.


DEFINITION  If F_1 and F_2 are two fields, an isomorphism from F_1 to F_2 is a function f
from F_1 to F_2 with the following properties:

(1) If x ≠ y, then f(x) ≠ f(y).
(2) If z is in F_2, then z = f(x) for some x in F_1.
(3) If x and y are in F_1, then

f(x ⊕ y) = f(x) + f(y),
f(x ⊙ y) = f(x) · f(y).

If F_1 and F_2 are ordered fields we also require:

(4) If x ⧀ y, then f(x) < f(y).


The fields F_1 and F_2 are called isomorphic if there is an isomorphism
between them. Isomorphic fields may be regarded as essentially the same: any
important property of one will automatically hold for the other. Therefore,
we can, and should, reformulate the question asked at the beginning of the
chapter: if F is a complete ordered field it is silly to expect F to equal R;
rather, we would like to know if F is isomorphic to R. In the following theorem,
F will be a field, with operations + and ·, and “positive elements” P; we
write a < b to mean that b − a is in P, and so forth.
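
Conditions (1)-(3) of the definition can be checked by brute force when the fields are finite. The Python sketch below is an illustration under assumptions of my own choosing: the first field is the integers mod 5, the second is the set of pairs (a, a) with componentwise operations mod 5, and the helper `is_isomorphism` is a name invented here. A finite field cannot be ordered, so condition (4) plays no role in this example.

```python
from itertools import product

P = 5
F1 = list(range(P))                  # the integers mod 5
F2 = [(a, a) for a in range(P)]      # the same elements under new names

def add1(x, y): return (x + y) % P
def mul1(x, y): return (x * y) % P
def add2(u, v): return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)
def mul2(u, v): return ((u[0] * v[0]) % P, (u[1] * v[1]) % P)

def is_isomorphism(f):
    """Check conditions (1)-(3) for a map f from F1 to F2."""
    images = [f(x) for x in F1]
    one_one = len(set(images)) == len(F1)          # condition (1)
    onto = set(images) == set(F2)                  # condition (2)
    preserves = all(                               # condition (3)
        f(add1(x, y)) == add2(f(x), f(y))
        and f(mul1(x, y)) == mul2(f(x), f(y))
        for x, y in product(F1, repeat=2)
    )
    return one_one and onto and preserves

def rename(x):
    return (x, x)

def doubled(x):
    return ((2 * x) % P, (2 * x) % P)
```

The map `rename` passes all three checks. The map `doubled` is one-one and onto and respects addition, but it fails for multiplication, which shows that the conditions in (3) are not automatic.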


THEOREM  If F is a complete ordered field, then F is isomorphic to R.

PROOF  Since two fields are defined to be isomorphic if there is an isomorphism
between them, we must actually construct a function f from R to F which is an
isomorphism. We begin by defining f on the integers as follows:


f(0) = 0,
f(n) = 1 + ··· + 1 (n times) for n > 0,
f(n) = −(1 + ··· + 1) (|n| times) for n < 0,

where the 0 and 1 on the right are the elements of F. It is easy to check that

f(m + n) = f(m) + f(n),
f(m · n) = f(m) · f(n),

for all integers m and n, and it is convenient to denote f(n) by a boldface n.
We then define f on the rational numbers by

f(m/n) = f(m)/f(n) = f(m) · f(n)^{-1}

(notice that f(n) = 1 + ··· + 1 ≠ 0 if n > 0, since F is an ordered field). This
definition makes sense because if m/n = k/l, then ml = nk, so f(m) · f(l) = f(k) · f(n),
so f(m) · f(n)^{-1} = f(k) · f(l)^{-1}. It is easy to check that

f(r_1 + r_2) = f(r_1) + f(r_2),
f(r_1 · r_2) = f(r_1) · f(r_2),

for all rational numbers r_1 and r_2, and that f(r_1) < f(r_2) if r_1 < r_2.
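
The definition of f on the integers and rationals uses nothing about F except its 0, its 1, and its operations, and this can be seen in a small Python sketch. The sketch is an illustration under stated assumptions: as a toy stand-in for F it uses pairs (q, q) of rational numbers with componentwise operations (a field which is ordered but not complete), and the names `f_int` and `f_rat` are invented here.

```python
from fractions import Fraction

# Toy stand-in for F: pairs (q, q) of rationals with componentwise operations.
ZERO = (Fraction(0), Fraction(0))
ONE = (Fraction(1), Fraction(1))

def add(u, v): return (u[0] + v[0], u[1] + v[1])
def mul(u, v): return (u[0] * v[0], u[1] * v[1])
def neg(u): return (-u[0], -u[1])
def inv(u): return (1 / u[0], 1 / u[1])

def f_int(n):
    """f(n) = 1 + ... + 1 (|n| times) in F, negated if n < 0."""
    total = ZERO
    for _ in range(abs(n)):
        total = add(total, ONE)
    return total if n >= 0 else neg(total)

def f_rat(m, n):
    """f(m/n) = f(m) . f(n)^(-1), for integers m and n with n != 0."""
    return mul(f_int(m), inv(f_int(n)))
```

Calling `f_rat(1, 2)` and `f_rat(3, 6)` gives the same element of the toy field, which is the sense in which the definition "makes sense"; the general argument is the one given in the text.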

The definition of f(x) for arbitrary x is based on the now familiar idea that
any real number is determined by the rational numbers less than it. For any
x in R, let A_x be the subset of F consisting of all f(r), for all rational numbers
r < x. The set A_x is certainly not empty, and it is also bounded above, for if
r_0 is a rational number with r_0 > x, then f(r_0) > f(r) for all f(r) in A_x. Since F
is a complete ordered field, the set A_x has a least upper bound; we define f(x)
as sup A_x.

We now have f(x) defined in two different ways, first for rational x, and 
then for any x. Before proceeding further, it is necessary to show that these two 
definitions agree for rational x. In other words, if x is a rational number, we 
want to show that 


sup A_x = f(x),

where f(x) here denotes f(m)/f(n), for x = m/n. This is not automatic, but depends
on the completeness of F; a slight digression is thus required.
Since F is complete, the elements

1 + ··· + 1 (n times), for natural numbers n,

form a set which is not bounded above; the proof is exactly the same as the
proof for R (Theorem 8-2). The consequences of this fact for R have exact
analogues in F: in particular, if a and b are elements of F with a < b, then
there is a rational number r such that

a < f(r) < b.
Having made this observation, we return to the proof that the two definitions
of f(x) agree for rational x. If y is a rational number with y < x, then we have
already seen that f(y) < f(x). Thus every element of A_x is < f(x). Consequently,

sup A_x ≤ f(x).

On the other hand, suppose that we had

sup A_x < f(x).

Then there would be a rational number r such that

sup A_x < f(r) < f(x).

But the condition f(r) < f(x) means that r < x, which means that f(r) is in the
set A_x; this clearly contradicts the condition sup A_x < f(r). This shows that
the original assumption is false, so

sup A_x = f(x).


We thus have a certain well-defined function f from R to F. In order to
show that f is an isomorphism we must verify conditions (1)-(4) of the
definition. We will begin with (4).

If x and y are real numbers with x < y, then clearly A_x is contained in A_y.
Thus

f(x) = sup A_x ≤ sup A_y = f(y).

To rule out the possibility of equality, notice that there are rational numbers
r and s with

x < r < s < y.

We know that f(r) < f(s). It follows that

f(x) ≤ f(r) < f(s) ≤ f(y).

This proves (4).

Condition (1) follows immediately from (4): If x ≠ y, then either x < y
or y < x; in the first case f(x) < f(y), and in the second case f(y) < f(x); in
either case f(x) ≠ f(y).

To prove (2), let a be an element of F, and let B be the set of all rational
numbers r with f(r) < a. The set B is not empty, and it is also bounded above,
because there is a rational number s with f(s) > a, so that f(s) > f(r) for r in
B, which implies that s > r. Let x be the least upper bound of B; we claim
that f(x) = a. In order to prove this it suffices to eliminate the alternatives

f(x) < a,
a < f(x).

In the first case there would be a rational number r with

f(x) < f(r) < a.

But this means that x < r and that r is in B, which contradicts the fact that
x = sup B. In the second case there would be a rational number r with

a < f(r) < f(x).

This implies that r < x. Since x = sup B, this means that r < s for some s in B.
Hence

f(r) < f(s) < a,

again a contradiction. Thus f(x) = a, proving (2).
To check (3), let x and y be real numbers and suppose that f(x + y) ≠
f(x) + f(y). Then either

f(x + y) < f(x) + f(y) or f(x) + f(y) < f(x + y).

In the first case there would be a rational number r such that

f(x + y) < f(r) < f(x) + f(y).

But this would mean that

x + y < r.

Therefore r could be written as the sum of two rational numbers

r = r_1 + r_2, where x < r_1 and y < r_2.

Then, using the facts checked about f for rational numbers, it would follow that

f(r) = f(r_1 + r_2) = f(r_1) + f(r_2) > f(x) + f(y),

a contradiction. The other case is handled similarly.
Finally, if x and y are positive real numbers, the same sort of reasoning shows
that

f(x · y) = f(x) · f(y);

the general case is then a simple consequence. ∎


This theorem brings to an end our investigation of the real numbers, and 
resolves any doubts about them: There is a complete ordered field and, up to
isomorphism, only one complete ordered field. It is an important part 
of a mathematical education to follow a construction of the real numbers in 
detail, but it is not necessary to refer ever again to this particular construction. 
It is utterly irrelevant that a real number happens to be a collection of rational 
numbers, and such a fact should never enter the proof of any important 




theorem about the real numbers. Reasonable proofs should use only the fact 
that the real numbers are a complete ordered field, because this property of 
the real numbers characterizes them up to isomorphism, and any significant
mathematical property of the real numbers will be true for all isomorphic 
fields. To be candid I should admit that this last assertion is just a prejudice 
of the author, but it is one shared by almost all other mathematicians. 


PROBLEMS 


1. Let f be an isomorphism from F_1 to F_2.

(a) Show that f(0) = 0 and f(1) = 1. (Here 0 and 1 on the left denote
elements in F_1, while 0 and 1 on the right denote elements of F_2.)

(b) Show that f(−a) = −f(a) and f(a^{-1}) = f(a)^{-1}, for a ≠ 0.

2. Here is an opportunity to convince yourself that any significant property
of a field is shared by any field isomorphic to it. The point of this problem
is to write out very formal proofs until you are certain that all statements
of this sort are obvious. F_1 and F_2 will be two fields which are isomorphic;
for simplicity we will denote the operations in both by + and ·. Show that:

(a) If the equation x^2 + 1 = 0 has a solution in F_1, then it has a solution
in F_2.

(b) If every polynomial equation x^n + a_{n-1} · x^{n-1} + ··· + a_0 = 0
with a_0, . . . , a_{n-1} in F_1 has a root in F_1, then every polynomial
equation x^n + b_{n-1} · x^{n-1} + ··· + b_0 = 0 with b_0, . . . , b_{n-1} in
F_2 has a root in F_2.

(c) If 1 + ··· + 1 (summed m times) = 0 in F_1, then the same is true
in F_2.

(d) If F_1 and F_2 are ordered fields (and the isomorphism f satisfies
f(x) < f(y) for x < y) and F_1 is complete, then F_2 is complete.

3. Let f be an isomorphism from F_1 to F_2 and g an isomorphism from F_2 to
F_3. Define the function g ∘ f from F_1 to F_3 by (g ∘ f)(x) = g(f(x)). Show
that g ∘ f is an isomorphism.

4. Suppose that F is a complete ordered field, so that there is an isomorphism
f from R to F. Show that there is actually only one isomorphism from R
to F. Hint: In case F = R, this is Problem 3-17. Now if f and g are two
isomorphisms from R to F consider g^{-1} ∘ f.

5. Find an isomorphism from C to C other than the identity function.


SUGGESTED 
READING 


A man ought to read 

just as inclination leads him;
for what he reads as a task 
will do him little good. 


SAMUEL JOHNSON 


One purpose of this bibliography is to guide the reader to other 
sources, but the most important function it can serve is to indicate the variety 
of mathematical reading available. Consequently, there is an attempt to 
achieve diversity, but no pretense of being complete. The present plethora of 
mathematics books would make such an undertaking almost hopeless in any 
case, and since I have tried to encourage independent reading, the more 
standard a text, the less likely it is to appear here. In some cases, this philos- 
ophy may seem to have been carried to extremes, as some entries in the list 
cannot be read by a student just finishing a first course of calculus until several 
years have elapsed. Nevertheless, there are many selections which can be read 
now, and I can’t believe that it hurts to have some idea of what lies ahead. 

One of the most elementary unproved theorems mentioned in this book is 
the fact that every natural number can be written as a product of primes in 
only one way. A proof of this basic theorem will be found near the beginning of 
almost any book on elementary number theory. Few books have won so 
enthusiastic an audience as 


[1] An Introduction to the Theory of Numbers (third edition), by G. H. Hardy 
and E. M. Wright; Clarendon Press, Oxford, 1960. 


The Pergamon Press publishes a series, Popular Lectures in Mathematics, 
with several worthwhile titles, among them 


[2] A Selection of Problems in the Theory of Numbers, by W. Sierpinski; Mac- 
millan (Pergamon), New York, 1964. 


Finally, I will mention a little book which I hope is still in print: 


[3] Three Pearls of Number Theory, by A. Khinchin; Graylock Press, 
Rochester, N.Y., 1952. 


The subject of irrational numbers straddles the fields of number theory and 
analysis. An excellent introduction will be found in 


[4] Irrational Numbers, by I. M. Niven; Wiley, New York, 1956. 


Together with many historical notes, there are references to some fairly 
elementary articles in journals. There is also a proof that π is transcendental
(see also [53]) and, finally, a proof of the “Gelfond-Schneider theorem”:
If a and b are algebraic, with a ≠ 0 or 1, and b is irrational, then a^b is
transcendental.

All the books listed so far begin with natural numbers, but whenever neces- 
sary take for granted the irrational numbers, not to mention the integers and 
rational numbers. Several recent books present a construction of the rational 
numbers from the natural numbers, but it is hard to believe that there can 
be a more lucid treatment than the one to be found in 


[5] Foundations of Analysis, by E. Landau; Chelsea, New York, 1951. 
Incidentally, the original German edition, 


[6] Grundlagen der Analysis (fourth edition), by E. Landau; Chelsea, New 
York, 1965. 


is now available in paperback, together with a complete German-English
dictionary (of about 300 words) for the whole book, an excellent way to begin
reading mathematical German. The basic idea for constructing the real
numbers is derived from Dedekind, whose contributions can be found in 


[7] Essays on the Theory of Numbers, by R. Dedekind; Dover, New York, 
1963.


While many mathematicians are content to accept the natural numbers 
as a natural starting point, numbers can be defined in terms of sets, the most 
basic starting point of all. A charming exposition of set theory can be 
found in a sophisticated little book called 


[8] Naive Set Theory, by P. R. Halmos; Van Nostrand, Princeton, N.J., 
1961. 


Another very good introduction is 
[9] Theory of Sets, by E. Kamke; Dover, New York, 1950. 


Perhaps it is necessary to assure some victims of the “new math” that set 
theory does have some mathematical content (in fact, some very deep 
theorems). Using these deep results, Kamke proves that there is a discon- 
tinuous function f such that f(x + y) = f(x) + f(y) for all x and y. For those 
who enjoy reading the classics, the most important notions of set theory were 
first introduced by Cantor, whose work is reproduced in 


[10] Contributions to the Founding of the Theory of Transfinite Numbers, by G. 
Cantor; Dover, New York, 1952. 


Inequalities, which were treated as an elementary topic in Chapters 1 and 2, 
actually form a specialized field. A good elementary introduction is provided 
by
[11] Analytic Inequalities, by N. Kazarinoff; Holt, Rinehart and Winston, 
New York, 1961. 
Twelve different proofs that the geometric mean is less than or equal to the 


arithmetic mean, each based on a different principle, can be found in the 
beginning of the more advanced book 


[12] Inequalities, by E. Beckenbach and R. Bellman; Springer, New York, 
1961. 


The classic work on inequalities is 


[13] Inequalities (second edition), by G. H. Hardy, J. E. Littlewood, and 
G. Polya; Cambridge University Press, New York, 1952. 


Each of the authors of this triple collaboration has provided his own con- 
tribution to the sparse literature about the nature of mathematical thinking, 
written from a mathematician’s point of view. My favorite is 


[14] A Mathematician’s Apology, by G. H. Hardy; Cambridge University 
Press, New York, 1940. 




Littlewood’s anecdotal selections are entitled 
[15] A Mathematician’s Miscellany, by J. E. Littlewood; Methuen, 1953. 
Polya’s contribution is pedagogy at the highest level: 


[16] Mathematics and Plausible Reasoning, by G. Polya (Vol. I: Induction and 
Analogy in Mathematics; Vol. II: Patterns of Plausible Inference); Princeton 
University Press, Princeton, N.J., 1954. 


Geometry is the other main field which can be considered as background 
for calculus. Euclid’s Elements is still a masterful mathematical work, but 
should perhaps be postponed until some preparation has been made. Probably 
the best modern work on “classical geometry” is


[17] Elementary Geometry from an Advanced Standpoint, by E. Moise; Addison- 
Wesley, Reading, Mass., 1963. 


This beautiful book provides excellent historical perspectives and contains a 
thorough discussion of the role of the “Archimedean axiom” in geometry; in 
addition, Chapter 28 describes an ordered field in which the Archimedean 
axiom does not hold. Speaking of beautiful geometry books, all sorts of 
fascinating things can be found in 


[18] Introduction to Geometry, by H. S. Coxeter; Wiley, New York, 1961. 


Almost all treatments of geometry at least mention convexity, which forms 
another specialized topic. I cannot imagine a better introduction to convexity, 
or a better mathematical experience in general, than reading and working 
through 


[19] Convex Figures, by I. M. Yaglom and W. G. Boltyanskii; Holt, Rinehart 
and Winston, New York, 1961. 


This book contains a carefully arranged sequence of definitions and statements 
of theorems, whose proofs are to be supplied by the reader (worked-out proofs 
are supplied in the back of the book). Another geometry book has been 
modeled on the same principle: 


[20] Combinatorial Geometry in the Plane, by H. Hadwiger and H. Debrunner; 
Holt, Rinehart and Winston, New York, 1964. 


Along with these two out-of-the-ordinary books, I might mention an extremely 
valuable little book, also of a specialized sort, 


[21] Counterexamples in Analysis, by B. Gelbaum and J. Olmsted; Holden- 
Day, San Francisco, 1964. 


Many of the examples in this book come from more advanced topics in 
analysis, but quite a few can be appreciated by someone who knows calculus. 




Of calculus books I will mention only two, each something of a classic: 


[22] A Course of Pure Mathematics (tenth edition), by G. H. Hardy; Cam- 
bridge University Press, New York, 1952. 


[23] Differential and Integral Calculus, by R. Courant; Vol. I, second edition, 
1937; Vol. II, first edition, Wiley (Interscience), New York, 1936. 


Courant is especially strong on applications. In addition, the latter parts of 


Volume I contain material usually found in advanced calculus, including 
differential equations and Fourier series. An introduction to Fourier series 
(requiring a little advanced calculus) will also be found in 


[24] An Introduction to Fourier Series and Integrals, by R. Seeley; W. A. Ben- 
jamin, New York, 1966. 


The second volume of Courant (advanced calculus in earnest) contains addi- 
tional material on differential equations, as well as an introduction to the 
calculus of variations. I will not mention any books devoted solely to the 
calculus of variations, since they are (of necessity) quite difficult. Innumerable 
books on differential equations have been written, but none of the elementary 
ones seems to generate much enthusiasm. There is a somewhat more advanced 
book, however, which is universally admired: 


[25] Lectures on Ordinary Differential Equations, by W. Hurewicz; M.I.T.
Press, Cambridge, Mass., 1958. 


I will bypass the more or less standard advanced calculus books (which can 
easily be found by the reader himself) since nowadays there is a move- 
ment to revise the whole presentation of advanced calculus, basing it upon 
linear algebra. One of the first, and still one of the nicest, treatments of 
advanced calculus using linear algebra is 


[26] Calculus of Vector Functions, by R. H. Crowell and R. E. Williamson; 
Prentice-Hall, Englewood Cliffs, N.J., 1962. 


Several recent books on advanced calculus attempt to acquaint under- 
graduates with very large areas of modern mathematics. My favorite, of 
course, is


[27] Calculus on Manifolds, by M. Spivak; W. A. Benjamin, New York, 1965. 


There are three other topics which are somewhat out of place in this 
bibliography because they are rapidly becoming established as part of a 
standard undergraduate curriculum. The purposeful study of fields and 
related systems is the domain of “algebra.” The first undergraduate text on
modern algebra was 


[28] A Survey of Modern Algebra (third edition), by G. Birkhoff and S. 
MacLane; Macmillan, New York, 1941. 




One of the strongest competitors is now 


[29] Topics in Algebra, by I. N. Herstein; Ginn (Blaisdell), Boston, Mass., 
1963. 


Herstein’s book is meant to be intermediate in difficulty between A Survey 
of Modern Algebra and one of the great classics: 


[30] Modern Algebra, by B. L. van der Waerden; Ungar, New York, 1953. 


By the way, this book contains a proof of the partial fraction decomposition 
of a rational function. 
One of the standard texts in complex analysis is 


[31] Complex Analysis (second edition), by L. V. Ahlfors; McGraw-Hill, 
New York, 1966. 


I am personally looking forward to the publication of 


[32] Lectures on the Theory of Functions of a Complex Variable, by G. W. Mackey; 
Van Nostrand, Princeton, N.J., 1967. 


Finally, the study of topology, although never mentioned before, has really 
been in the background of many discussions, since it is the natural generaliza- 
tion of the ideas about limits and continuity which play such a prominent role 
in Part II of this book. There are now many elementary books on topology, 
but I am dismayed by the prospect of reading and evaluating them all, so I 
will merely call attention to the fact that they exist, and warn that some are 
not very good. 

The next few topics, ranging from elementary to very difficult, are included 
in this bibliography because they have been alluded to in the text. The proof 
that a nondecreasing function is differentiable at almost all points (and an 
explanation of just what this means) receives a beautiful exposition in 


[33] Functional Analysis, by F. Riesz and B. Sz.-Nagy; Ungar, New York, 
1955. 


(After this elementary beginning, the book moves on to quite advanced 
material.) The gamma function has an elegant little book devoted entirely to 
its properties, most of them proved by using the theorem of Bohr and Mollerup 
which was mentioned in Problem 18-25: 


[34] The Gamma Function, by E. Artin; Holt, Rinehart and Winston, New 
York, 1964. 

The gamma function is only one of several important improper integrals in
mathematics. In particular, the calculation of ∫_0^∞ e^{-x^2} dx (see Problem 18-27)
is important in probability theory, where the “normal distribution function”


P(x) = = [. ei" dy 




plays a fundamental role. The following book has already become something 
of a classic: 


[35] An Introduction to Probability Theory and Its Applications (second edition),
by W. Feller; Wiley, New York, 1957. 


The impossibility of integrating certain functions in elementary terms (among 
them f(x) = e^{-x^2}) is one of the most esoteric subjects in mathematics. An
interesting discussion of the possibilities of integrating in elementary terms, 
with an outline of the impossibility proofs, and references to the original 
papers of Liouville, will be found in 


[36] The Integration of Functions of a Single Variable (second edition), by 
G. H. Hardy; Cambridge University Press, New York, 1958. 


A complete presentation of the impossibility proofs will be found in 


[37] Integration in Finite Terms, by J. Ritt; Columbia University Press, New 
York, 1948. 


Oddly enough, a related but seemingly more difficult problem has a much 
neater solution. There are simple differential equations (y″ + xy = 0 is a
specific example) whose solutions cannot be expressed even in terms of 
indefinite integrals of elementary functions. This fact is proved on page 43 
of the (60-page) book: 


[38] An Introduction to Differential Algebra, by I. Kaplansky; Hermann, Paris, 
1957. 


To read this book you will need to know quite a bit of algebra, however. 
(Just recently, equally algebraic and polished treatments of Liouville’s 
theorems have been obtained, and presumably they will be published 
soon.) 

A few words should also be said in defense of the process of integrating in 
elementary terms, which many mathematicians look upon as an art (unlike
differentiation, which is merely a skill). You are probably already aware that 
the process of integration can be expedited by tables of indefinite integrals. 
For those who enjoy perusing tables there is a really beautiful collection, that 
includes indefinite integrals, definite improper integrals, and a great deal 
more besides (if you should ever happen to need the value of the thirty-fourth 
Bernoulli number, this is the place to look): 


[39] Tables of Integrals, Series, and Products, by I. S. Gradshteyn and I. M.
Ryzhik; Academic Press, New York, 1965.


For the thrifty, there is a paperback table of integrals: 
[40] Tables of Indefinite Integrals, by G. Petit Bois; Dover, New York, 1961. 




The remaining references are of a somewhat different sort. They fall into
three categories, of which the first is historical. The letter of H. A. Schwarz
referred to in Problem 11-49 will be found in 


[41] Ways of Thought of Great Mathematicians, by H. Meschkowski; Holden- 
Day, San Francisco, 1964. 


Some historical remarks, and an attempt to incorporate them into the teaching 
of calculus, will be found in 


[42] The Calculus: A Genetic Approach, by O. Toeplitz; University of 
Chicago Press, Chicago, Ill., 1963.


Most textbooks on the history of mathematics are both superficial and dull. 
An admirable exception is 


[43] An Introduction to the History of Mathematics, by H. Eves; Holt, Rinehart 
and Winston, New York, 1964. 


Three good scholarly works are 


[44] History of Analytic Geometry, by C. Boyer; Academic Press, New York, 
1965. 

[45] A History of the Calculus, and Its Conceptual Development, by C. Boyer; 
Dover, New York, 1959. 

[46] The Mathematics of Great Amateurs, by J. Coolidge; Dover, New York, 
1963. 


A recent book presents philosophical ruminations on the history of mathe- 
matics by an author whose exceptional erudition is all the more impressive 
in a first-rate mathematician: 


[47] The Role of Mathematics in the Rise of Science, by S. Bochner; Princeton 
University Press, Princeton, N.J., 1966. 


Finally, extracts from original sources will be found in 


[48] A Source Book in Mathematics (2 vols.), by D. Smith; Dover, New York, 
1959. 


The large number of books listed here might easily give too hopeful a view of
the availability of historical information. Although the very early origins of
mathematics are receiving the most careful scrutiny, no one seems to care very
much about the origins of calculus. For example, it is almost impossible to find
out who first proved the Mean Value Theorem (according to the Encyklopädie
der Mathematischen Wissenschaften, Volume II, it was O. Bonnet, whose name
is familiar to students of differential geometry from the “Gauss-Bonnet
Theorem”). Similarly, nearly every history book tells us that Wallis proved
Wallis’s formula by a “complicated method of interpolation,” but almost
none bothers to mention what it was, even though it inspired Euler’s
investigations of the gamma function (a description is given in the answer
book, along with the solution to Problem 18-26).

The second category in this final group of books might be described as 
“popularizations.” The New York Times editorials notwithstanding, there are
a surprisingly large number of first-rate ones by real mathematicians: 


[49] What is Mathematics? (fourth edition), by R. Courant and H. Robbins; 
Oxford University Press, New York, 1947.

[50] Geometry and the Imagination, by D. Hilbert and S. Cohn-Vossen;
Chelsea, New York, 1956. 

[51] The Enjoyment of Mathematics, by H. Rademacher and O. Toeplitz; 
Princeton University Press, Princeton, N.J., 1957. 

[52] Famous Problems of Mathematics (second edition), by H. Tietze; Gray- 
lock Press, Rochester, N.Y., 1965. 


One of the most renowned “popularizations” is especially concerned with the
teaching of mathematics: 


[53] Elementary Mathematics from an Advanced Standpoint, by F. Klein (vol. 1: 
Arithmetic, Algebra, Analysis; vol. 2: Geometry); Dover, New York, 1948. 


Volume 1 contains a proof of the transcendence of π which, although not so
elementary as the one in [4], is a direct analogue of the proof that e is
transcendental, replacing integrals with complex line integrals. It can be read
as soon as the basic facts about complex analysis are known.

The third category is the very opposite extreme: original papers. The
difficulties encountered here are formidable, and I have only had the courage 
to list one such paper, the source of the quotation for Part IV. It is not even 
in English, although you do have a choice of foreign languages. The article 
in the original French is in 


[54] Oeuvres Complètes d’Abel; Christiania. Johnson Reprint Corporation,
New York, 1965. 


It first appeared in a German translation in the Journal für die reine und
angewandte Mathematik (1) 1826. To compound the difficulties, these references 
will usually be available only in university libraries. Yet the study of this 
paper will probably be as valuable as any other reading mentioned here. The
reason is suggested by a remark of Abel himself, who attributed his profound 
knowledge of mathematics to the fact that he read the masters, rather than 
the pupils. 


ANSWERS 
TO SELECTED 
PROBLEMS 


CHAPTER 1 


10. 


11. 


12. 




(i) 
(iii) 


(vi) 


1 = a⁻¹ · a = a⁻¹(ax) = (a⁻¹ · a)x = 1 · x = x.

If x² = y², then 0 = x² − y² = (x − y)(x + y), so either x − y = 0
or x + y = 0, that is, either x = y or x = −y.

Replace y by —y in (iv). 


One step requires dividing by x — y = 0. 


(i) 
(ii)


(iii) 
(v) 


(i) 
(iii)
(v) 
(vii) 
(ix) 
(xi) 
(xiii) 


(i) 


(iii) 
(v) 


(vii) 
(ix) 
(i) 
(iii) 
(v) 
(i) 


(iii) 
(i) 
(iii) 
(v) 


(vii) 


(i) 
(iii) 


a/b = ab⁻¹ = (ac)(b⁻¹c⁻¹) = (ac)(bc)⁻¹ (by (iii)) = (ac)/(bc).

(ad + bc)/(bd) = (ad + bc)(bd)⁻¹ = (ad + bc)(b⁻¹d⁻¹) (by (iii))
= ab⁻¹ + cd⁻¹ = a/b + c/d.

ab(a⁻¹b⁻¹) = (a · a⁻¹)(b · b⁻¹) = 1, so a⁻¹b⁻¹ = (ab)⁻¹.

(a/b)/(c/d) = (a/b)(c/d)⁻¹ = (a · b⁻¹)(c · d⁻¹)⁻¹

= (a · b⁻¹)(c⁻¹ · d) = ad(b⁻¹c⁻¹) = ad(bc)⁻¹

= (ad)/(bc).

x< li. 

x > √7 or x < −√7.

All x, since x² − 2x + 2 = (x − 1)² + 1.

x > 3 or x < −2, since 3 and −2 are the roots of x² − x − 6 = 0.
x > π or −5 < x < 3.

1 ee a 

x > 1 or 0 < x < 1.

b − a and d − c are in P, so (b − a) + (d − c) = (b + d) −
(a + c) is in P. Thus, b + d > a + c.

Using (ii), −c < −d; then (i) implies that a + (−c) < b + (−d).
(b − a) and −c are in P, so −c(b − a) = ac − bc is in P, that is,
ac > bc.

Using (iv), a > 0 and a < 1, so a² < a.

Substitute a for c and b for d in (viii).

√2 + √3 − √5 + √7.

|a + b| + |c| − |a + b + c|.

√2 + √3 + √5 − √7.

a if a ≥ −b and b ≥ 0;

−a if a ≤ −b and b ≤ 0;

a + 2b if a ≥ −b and b ≤ 0;

−a − 2b if a ≤ −b and b ≥ 0.

x − x² if x ≥ 0;

−x − x² if x ≤ 0.

x = 11, −5.

−6 < x < −2.

No x (the distance from x to 1 plus the distance from x to −1 is at
least 2).

x = 1, −1.

(|xy|)² = (xy)² = x²y² = |x|²|y|² = (|x| · |y|)²; since |xy| and |x| · |y|
are both ≥ 0, this proves that |xy| = |x| · |y|.

|x|/|y| = |x| · |y|⁻¹ = |x| · |y⁻¹| (by (ii)) = |xy⁻¹| (by (i)) = |x/y|.




(v) It follows from (iv) that |x| = |y − (y − x)| ≤ |y| + |y − x|, so
|x| − |y| ≤ |x − y|.

(vii) |x + y + z| ≤ |x + y| + |z| ≤ |x| + |y| + |z|. If equality holds,
then |x + y| = |x| + |y|, so x and y have the same sign. Moreover,
z must have the same sign as x + y, so x, y, and z must all have the
same sign (unless one is 0).


CHAPTER 2


1. (i) Since 1² = 1 · (2) · (2 · 1 + 1)/6, the formula is true for n = 1.
Suppose that the formula is true for k. Then

1² + ··· + k² + (k + 1)² = k(k + 1)(2k + 1)/6 + (k + 1)²
 = (k + 1)[k(2k + 1) + 6(k + 1)]/6
 = (k + 1)[(k + 2)(2k + 3)]/6
 = (k + 1)((k + 1) + 1)(2(k + 1) + 1)/6,

so the formula is true for k + 1.


2. (i) Σ_{i=1}^{n} (2i − 1) = 1 + 3 + 5 + ··· + (2n − 1)
 = 1 + 2 + 3 + ··· + 2n − 2(1 + ··· + n)
 = (2n)(2n + 1)/2 − n(n + 1)
 = n².
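The two closed forms above can be spot-checked by brute force. The following Python sketch is only an illustration and is no substitute for the induction; the function names are invented for this example.

```python
def sum_of_squares(n):
    """Closed form n(n + 1)(2n + 1)/6 for 1^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6


def sum_of_odds(n):
    """Closed form n^2 for 1 + 3 + ... + (2n - 1)."""
    return n * n


def check_formulas(limit):
    """Compare both closed forms with direct sums for n = 1, ..., limit."""
    for n in range(1, limit + 1):
        if sum(k * k for k in range(1, n + 1)) != sum_of_squares(n):
            return False
        if sum(2 * i - 1 for i in range(1, n + 1)) != sum_of_odds(n):
            return False
    return True


print(check_formulas(200))
```

Integer arithmetic is used throughout, so the comparison is exact; a finite check of this kind can only expose a mistake, never prove the formula.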


4. (a) Since

1 + r = (1 − r²)/(1 − r),

the formula is true for n = 1. Suppose that

1 + r + ··· + rⁿ = (1 − rⁿ⁺¹)/(1 − r).

Then

1 + r + ··· + rⁿ + rⁿ⁺¹ = (1 − rⁿ⁺¹)/(1 − r) + rⁿ⁺¹ = (1 − rⁿ⁺²)/(1 − r),

so the formula is true for n + 1.

(b)
S = 1 + r + ··· + rⁿ,
rS = r + ··· + rⁿ + rⁿ⁺¹.
Thus
S(1 − r) = S − rS = 1 − rⁿ⁺¹,
so
S = (1 − rⁿ⁺¹)/(1 − r).


5. (i) From
(k + 1)⁴ − k⁴ = 4k³ + 6k² + 4k + 1, k = 1, ..., n

we obtain
(n + 1)⁴ − 1 = 4 Σ_{k=1}^{n} k³ + 6 Σ_{k=1}^{n} k² + 4 Σ_{k=1}^{n} k + n,

so

Σ_{k=1}^{n} k³ = [(n + 1)⁴ − 1 − 6 · n(n + 1)(2n + 1)/6 − 4 · n(n + 1)/2 − n]/4
 = n⁴/4 + n³/2 + n²/4.

(iii) From
1/k − 1/(k + 1) = 1/(k(k + 1)), k = 1, ..., n
we obtain

1 − 1/(n + 1) = Σ_{k=1}^{n} 1/(k(k + 1)).


1 is either even or odd; in fact it is odd. Suppose n is either even or odd;
then n can be written either as 2k or 2k + 1. In the first case n + 1 =
2k + 1 is odd; in the second case n + 1 = 2k + 1 + 1 = 2(k + 1) is
even. In either case, n + 1 is either even or odd. (Admittedly, this looks
fishy, but it is really correct.)

Let B be the set of all natural numbers l such that n₀ − 1 + l is in A.
Then 1 is in B, and l + 1 is in B if l is in B, so B contains all natural
numbers, which means that A contains all natural numbers ≥ n₀.


11. (a) Yes, for if a + b were rational, then b = (a + b) − a would be
rational. If a and b are irrational, then a + b could be rational, for b
could be r − a for some rational number r.

(b) If a = 0, then ab is rational. But if a ≠ 0, then ab could not be
rational, for then b = (ab) · a⁻¹ would be rational.




(c) Yes; for example, ⁴√2.
(d) Yes; for example, √2 and −√2.
12. (a) Since 


(3n + 1)² = 9n² + 6n + 1 = 3(3n² + 2n) + 1,
(3n + 2)² = 9n² + 12n + 4 = 3(3n² + 4n + 1) + 1,


it follows that if k² is divisible by 3, then k must also be divisible by 3.
Now suppose that √3 were rational, and let √3 = p/q where p and
q have no common factor. Then p² = 3q², so p² is divisible by 3, so
p must be. Thus, p = 3p′ for some natural number p′, and consequently
(3p′)² = 3q², or 3(p′)² = q². Thus, q is also divisible by 3, a
contradiction.

The same proofs work for √5 and √6, because the equations


(5n + 1)² = 25n² + 10n + 1 = 5(5n² + 2n) + 1,
(5n + 2)² = 25n² + 20n + 4 = 5(5n² + 4n) + 4,
(5n + 3)² = 25n² + 30n + 9 = 5(5n² + 6n + 1) + 4,
(5n + 4)² = 25n² + 40n + 16 = 5(5n² + 8n + 3) + 1,


and the corresponding equations for numbers of the form 6n + m,
show that if k² is divisible by 5 or 6, then k must be. The proof fails
for √4, because (4n + 2)² is divisible by 4. (For precisely this reason
this proof cannot be used to show that in general √a is irrational if
a is not a perfect square: we have no guarantee that (an + m)² might
not be a multiple of a for some m < a. Actually, this assertion is true,
but the proof requires the information in Problem 16.)
(b) Since 


(2n + 1)³ = 8n³ + 12n² + 6n + 1 = 2(4n³ + 6n² + 3n) + 1,


it follows that if k³ is even, then k is even. If ∛2 = p/q where p and
q had no common factors, then p³ = 2q³, so p³ is divisible by 2, so p
must be. Thus, p = 2p′ for some natural number p′, and consequently
(2p′)³ = 2q³, or 4(p′)³ = q³. Thus, q is also even, a contradiction.

The proof for ∛3 is similar, using the equations


(3n + 1)³ = 27n³ + 27n² + 9n + 1 = 3(9n³ + 9n² + 3n) + 1,
(3n + 2)³ = 27n³ + 54n² + 36n + 8 = 3(9n³ + 18n² + 12n + 2) + 2.
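The key step in part (a), that k² is divisible by a only when k is, can be tested directly by listing the squares of the nonzero residues modulo a. The Python sketch below uses hypothetical helper names and is meant only as an illustration of why the argument goes through for 3, 5, and 6 but breaks down for 4.

```python
def square_residues(a):
    """Residues of m^2 modulo a for m = 1, ..., a - 1,
    that is, for representatives of all non-multiples of a."""
    return {(m * m) % a for m in range(1, a)}


def argument_works(a):
    """The divisibility step of the proof is valid for a exactly when
    no non-multiple of a has a square divisible by a."""
    return 0 not in square_residues(a)


for a in (3, 4, 5, 6):
    print(a, sorted(square_residues(a)), argument_works(a))
```

For a = 4 the residue 0 shows up because 2² = 4, which is the failure noted above for (4n + 2)².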


18. If n = 1, then (1 + h)ⁿ = 1 + nh. Suppose that (1 + h)ⁿ ≥ 1 + nh.
Then


(1 + h)ⁿ⁺¹ = (1 + h)(1 + h)ⁿ ≥ (1 + h)(1 + nh), since 1 + h > 0
 = 1 + (n + 1)h + nh² ≥ 1 + (n + 1)h.


CHAPTER 3 


11. 


12; 




For h > 0, the inequality follows directly from the binomial theorem,
since all the other terms appearing in the expansion of (1 + h)ⁿ are
positive.
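The inequality (1 + h)ⁿ ≥ 1 + nh for h > −1 can be tested with exact rational arithmetic. This is a minimal sketch, assuming a small grid of sample values chosen only for illustration; the function names are not from the text.

```python
from fractions import Fraction


def bernoulli_holds(h, n):
    """Check (1 + h)^n >= 1 + n*h exactly for a rational h > -1."""
    h = Fraction(h)
    return (1 + h) ** n >= 1 + n * h


def check_grid(max_n=30):
    """Test h = -9/10, -8/10, ..., 3 against n = 1, ..., max_n."""
    hs = [Fraction(k, 10) for k in range(-9, 31)]
    return all(bernoulli_holds(h, n) for h in hs for n in range(1, max_n + 1))


print(check_grid())
```

Using Fraction instead of floating point keeps borderline cases such as h = 0, where equality holds, free of rounding error.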


(i) (x + 1)/(x + 2); the expression f(f(x)) makes sense only when
x ≠ −1 and x ≠ −2.

(iii) 1/(1 + cx) (for x ≠ −1/c if c ≠ 0).

(v) (x + y + 2)/((x + 1)(y + 1)) (for x, y ≠ −1).

(vii) Only c = 1, since f(x) = f(cx) implies that x = cx, and this must
be true for at least one x ≠ 0.

(i) y ≥ 0 and rational, or y ≥ 1.

(iii) 0. 

(v) —1, 0, 1. 

(i) {x: −1 ≤ x ≤ 1}.

(iii) {x: x ≠ 1 and x ≠ 2}.


(v) ∅.

(i) 2^(2y).

(iii) 2^(2 sin t) + sin(2^t).

(i) P ∘ s.

(iii) s ∘ S.

(v) P ∘ P.

(vii) s ∘ s ∘ s ∘ P ∘ P ∘ P ∘ s.
(a) y. 

(b) HQ). 

(c) 10). 

(a)          even       odd

    even     even       neither

    odd      neither    odd


(b) even odd 
even 
odd | 

(c) feven  f odd 
g even 
g odd 


(d) Let g(x) = f(x) for x > 0 and define g arbitrarily for x < 0. 




CHAPTER 4 


21. 


10. 


(i) Let g(x) = h(x) = 1 and let f be a function for which f(2) ≠
f(1) + f(1). Then f ∘ (g + h) ≠ f ∘ g + f ∘ h.

(ii) [(g + h) ∘ f](x) = (g + h)(f(x)) = g(f(x)) + h(f(x)) =
(g ∘ f)(x) + (h ∘ f)(x) = [(g ∘ f) + (h ∘ f)](x).

(iii) (1/(f ∘ g))(x) = 1/(f ∘ g)(x) = 1/f(g(x)) = ((1/f) ∘ g)(x).

(iv) Let g(x) = 2 and let f be a function for which f(1/2) ≠ 1/f(2).
Then 1/(f ∘ g) ≠ f ∘ (1/g).


Gy). (, 4). 
(iii) [2, 4]. 
(vw): (2,2). 


(vii) (−∞, −1] ∪ [1, ∞).

(i) All points below the graph of f(x) = x. 

(iii) All points below the graph of f(x) = x?. 

(v) All points between the graphs of f(x) = x + 1 and f(x) = x − 1.

(vii) A collection of straight lines parallel to the graph of f(x) = —x, 

intersecting the horizontal axis at the points (n, 0) for integers n. 

(ix) All points inside the circle of radius 1 around (1, 2). 

(i) A square with vertices (1, 0), (0, 1), (—1, 0), and (0, —1). 

(iii) The union of the graph of f(x) = x and of f(x) = 2 — x. 

(v) The point (0, 0). 

(vii) The circle of radius √5 around (1, 0), since x² − 2x + y² =

(x − 1)² + y² − 1.

(a) Simply observe that the graph of f(x) = m(x − a) + b = mx +
(b − ma) is a straight line with slope m, which goes through the
point (a, b). (The important point about this exercise is simply to
remember the point-slope form.)

(b) The straight line through (a, b) and (c, d) has slope (d − b)/(c − a),
so the equation follows from part (a).

(c) When m = m′ and b ≠ b′. In that case, there is clearly no number x
with f(x) = g(x), while such a number x always exists if m ≠ m′,
namely, x = (b′ − b)/(m − m′).

(a) If B = 0 and A ≠ 0, then the set is the vertical straight line formed
by all points (x, y) with x = −C/A. If B ≠ 0, the set is the graph of
f(x) = (−A/B)x + (−C/B).

(b) The points (x, y) on the vertical line with x = a are precisely the
ones which satisfy 1 · x + 0 · y + (−a) = 0. The points (x, y) on
the graph of f(x) = mx + b are precisely the ones which satisfy
(−m)x + 1 · y + (−b) = 0.

(i) The graph of f is symmetric with respect to the vertical axis.

(ii) The graph of f is symmetric with respect to the origin. Equivalently,
the part of the graph to the left of the vertical axis is
obtained by reflecting first through the vertical axis, and then
through the horizontal axis.
(iii) The graph of f lies above or on the horizontal axis.
(iv) The graph of f repeats the part between 0 and a over and over.
19. (a) The square of the distance from (x, x²) to (0, 1/4) is

x² + (x² − 1/4)² = x² + x⁴ − x²/2 + 1/16
 = x⁴ + x²/2 + 1/16
 = (x² + 1/4)²,

which is the square of the distance from (x, x²) to the graph of g.
(b) The point (x, y) satisfies this condition if and only if

(x − α)² + (y − β)² = (y − γ)²,

or

x² − 2αx + α² + y² − 2βy + β² = y² − 2γy + γ²,

or

y = (1/(2β − 2γ))x² + (α/(γ − β))x + (α² + β² − γ²)/(2β − 2γ).

(This solution works only for β ≠ γ, which is just the condition that
P is not on L. If P is on L, then the solution is the vertical line
through P.)


CHAPTER 5


1. (i) lim_{x→2} (x³ − 8)/(x − 2) = lim_{x→2} (x² + 2x + 4) = 12.

(v) lim_{x→y} (xⁿ − yⁿ)/(x − y) = lim_{x→y} (xⁿ⁻¹ + xⁿ⁻²y + ··· + xyⁿ⁻² + yⁿ⁻¹)
 = yⁿ⁻¹ + yⁿ⁻¹ + ··· + yⁿ⁻¹ = nyⁿ⁻¹.

(vi) lim_{h→0} (√(a + h) − √a)/h
 = lim_{h→0} (√(a + h) − √a)(√(a + h) + √a)/(h(√(a + h) + √a))
 = lim_{h→0} 1/(√(a + h) + √a) = 1/(2√a).


2. (i) It is possible to find δ by beginning with the equation


x⁴ − a⁴ = (x − a)(x³ + ax² + a²x + a³).

If |x − a| < 1, then |x| < 1 + |a|, so

|x³ + ax² + a²x + a³| ≤ |x|³ + |a| · |x|² + |a|² · |x| + |a|³
 < (1 + |a|)³ + |a|(1 + |a|)² + |a|²(1 + |a|) + |a|³;

therefore we can choose

δ = min(1, ε/((1 + |a|)³ + |a|(1 + |a|)² + |a|²(1 + |a|) + |a|³)).

It is instructive, and probably easier, to use part (2) of the lemma.
This shows that |x⁴ − a⁴| < ε when

|x² − a²| < min(1, ε/(2(|a|² + 1))),

which is true when

|x − a| < min(1, min(1, ε/(2(|a|² + 1)))/(2(|a| + 1)))
 = min(1, ε/(4(|a|² + 1)(|a| + 1))) = δ.

(ii) By part (3) of the lemma, |1/x − 1| < ε when

|x − 1| < min(1/2, ε/2) = δ.


(iii) By part (1) of the lemma, |(x⁴ + 1/x) − 2| < ε when |1/x − 1| <
ε/2 and |x⁴ − 1| < ε/2. According to parts (i) and (ii) of this
problem, this happens when
jx — i] < min (3 = 1, 3) = min (: =) = ὃ 

24 δεν 22 
(v) Let δ = ε², since 0 < |x| < ε² implies that √|x| < ε.
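The first choice of δ in part (i), obtained by bounding x³ + ax² + a²x + a³ when |x − a| < 1, can be spot-checked numerically. The sketch below is only an illustration under that reading of the answer; it samples points with 0 < |x − a| < δ and confirms |x⁴ − a⁴| < ε using exact rationals. The function names are hypothetical.

```python
from fractions import Fraction


def delta_for_fourth_power(a, eps):
    """delta = min(1, eps / B) with
    B = (1+|a|)^3 + |a|(1+|a|)^2 + |a|^2 (1+|a|) + |a|^3."""
    A = abs(Fraction(a))
    bound = (1 + A) ** 3 + A * (1 + A) ** 2 + A ** 2 * (1 + A) + A ** 3
    return min(Fraction(1), Fraction(eps) / bound)


def delta_works(a, eps, samples=200):
    """Sample x on both sides of a with 0 < |x - a| < delta and
    verify |x^4 - a^4| < eps at every sample."""
    a, eps = Fraction(a), Fraction(eps)
    d = delta_for_fourth_power(a, eps)
    for i in range(1, samples):
        t = d * Fraction(i, samples)  # 0 < t < d
        for x in (a - t, a + t):
            if abs(x ** 4 - a ** 4) >= eps:
                return False
    return True


print(delta_works(2, Fraction(1, 100)), delta_works(-3, 5))
```

A sampled check cannot replace the estimate itself, but it quickly reveals a mistyped bound.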
(i) We need |f(x) -- 2] < ¢/2 and |g(x) -- 4 « ¢/2, so we need 


2 2 
0 < |x — 2| < min (sin (=) ἘΞ =) = ὃ. 


[σ(χ) — 4 < min (Ξ: ot) 


(11) -We need 


2 
so we need 


0 < |x — 2] < [min(2, 8¢)]? = 6. 
Let l = lim_{x→a} f(x) and define g(h) = f(a + h). Then for every ε > 0
there is a δ > 0 such that, for all x, if 0 < |x − a| < δ, then |f(x) − l| < ε.
Now, if 0 < |h| < δ, then 0 < |(h + a) − a| < δ, so |f(a + h) − l| < ε.
This inequality can be written |g(h) − l| < ε. Thus, lim_{h→0} g(h) = l, which
can also be written lim_{h→0} f(a + h) = l. The same sort of argument shows
that if lim_{h→0} f(a + h) = m, then lim_{x→a} f(x) = m. So either limit exists
if the other does, and in this case they are equal.


9. (a) Intuitively, we can get f(x) as close to l as we like if and only if we
can get f(x) − l as close to 0 as we like. The formal proof is so
trivial that it takes a bit of work to make it look like a proof at all.
To be very precise, suppose lim_{x→a} f(x) = l and let g(x) = f(x) − l.
Then for all ε > 0 there is a δ > 0 such that, for all x, if 0 <
|x − a| < δ, then |f(x) − l| < ε. This last inequality can be written
|g(x) − 0| < ε, so lim_{x→a} g(x) = 0. The argument in the other
direction is similarly uninteresting.

(b) Intuitively, making x close to a is the same as making x − a close
to 0. Formally: Suppose that lim_{x→a} f(x) = l, and let g(x) = f(x + a).
Then for all ε > 0 there is a δ > 0 such that, for all x, if 0 < |x − a|
< δ, then |f(x) − l| < ε. Now, if 0 < |y| < δ, then 0 < |(y + a) − a|
< δ, so |f(y + a) − l| < ε. But this last inequality can be written
|g(y) − l| < ε. So lim_{y→0} g(y) = l. The argument in the reverse
direction is similar.

(c) Intuitively, x is close to 0 if and only if x³ is. Formally: Let
lim_{x→0} f(x) = l. For every ε > 0 there is a δ > 0 such that if
0 < |x| < δ, then |f(x) − l| < ε. Then if 0 < |x| < min(1, δ), we
have 0 < |x³| < δ, so |f(x³) − l| < ε. Thus, lim_{x→0} f(x³) = l. On the
other hand, if we assume that lim_{x→0} f(x³) exists, say lim_{x→0} f(x³) = m,
then for all ε > 0 there is a δ such that if 0 < |x| < δ, then
|f(x³) − m| < ε. Then if 0 < |x| < δ³, we have 0 < |∛x| < δ, so
|f([∛x]³) − m| < ε, or |f(x) − m| < ε. Thus lim_{x→0} f(x) = m.

(d) Let f(x) = 1 for x > 0, and f(x) = −1 for x < 0. Then lim_{x→0} f(x²)
= 1, but lim_{x→0} f(x) does not exist.


15. (a) The function f(x) = 1/x cannot approach a limit at 0, since it
becomes arbitrarily large near 0. In fact, no matter what δ > 0 may
be, there is some x satisfying 0 < |x| < δ, but 1/x > |l| + ε, namely,
x = min(δ, 1/(|l| + ε)). This x does not satisfy |1/x − l| < ε.

(b) No matter what δ > 0 may be, there is some x satisfying 0 < |x − 1|
< δ, but 1/(x − 1) > |l| + ε, namely, x = min(1 + δ, 1 + 1/
(|l| + ε)). This x does not satisfy |1/(x − 1) − l| < ε. (It is also
possible to apply Problem 9(b): lim_{x→0} 1/x = lim_{x→1} 1/(x − 1) if the
latter exists, so this limit does not exist, because of part (a).)


29. (i) This is the usual definition, simply calling the numbers δ and ε,
instead of ε and δ.
(ii) This is a minor modification of (i): if the condition is true for all
δ > 0, then it applies to δ/2, so there is an ε > 0 such that if
0 < |x − a| < ε, then |f(x) − l| ≤ δ/2 < δ.
(iii) This is a similar modification: apply it to δ/5 to obtain (i).




CHAPTER 6 


CHAPTER 7 


2 


28. 


1; 


1. 


1. 


(iv) This is also a modification: it says the same thing as (i), since
ε/10 > 0, and it is only the existence of some ε > 0 that is in
question.

If lim_{x→a+} f(x) = lim_{x→a−} f(x) = l, then for every ε > 0 there are δ₁, δ₂ > 0
such that, for all x,


if a < x < a + δ₁, then |f(x) − l| < ε,
if a − δ₂ < x < a, then |f(x) − l| < ε.


Let δ = min(δ₁, δ₂). If 0 < |x − a| < δ, then either a − δ₂ ≤ a −
δ < x < a or else a < x < a + δ ≤ a + δ₁, so |f(x) − l| < ε.

(i) If l = lim_{x→0+} f(x), then for all ε > 0 there is a δ > 0 such that
|f(x) − l| < ε for 0 < x < δ. If −δ < x < 0, then 0 < −x < δ,
so |f(−x) − l| < ε. Thus lim_{x→0−} f(−x) = l. Similarly, if lim_{x→0−} f(−x)
exists, then lim_{x→0+} f(x) exists and has the same value. (Intuitively,
x is close to 0 and positive if and only if −x is close to 0 and
negative.)

(ii) If l = lim_{x→0+} f(x), then for all ε > 0 there is a δ > 0 such that
|f(x) − l| < ε for 0 < x < δ. So if 0 < |x| < δ, then |f(|x|) − l|
< ε. Thus lim_{x→0} f(|x|) = l. The reverse direction is similar.
(Intuitively, if x is close to 0, then |x| is close to 0 and positive.)

(iii) If l = lim_{x→0+} f(x), then for all ε > 0 there is a δ > 0 such that
|f(x) − l| < ε for 0 < x < δ. If 0 < |x| < √δ, then 0 < x² < δ,
so |f(x²) − l| < ε. Thus lim_{x→0} f(x²) = l. The reverse direction is
similar. (Intuitively, if x is close to 0, then x² is close to 0 and
positive.)

If l = lim_{x→∞} f(x), then for every ε > 0 there is some N such that |f(x) − l|
< ε for x > N, and we can clearly assume that N > 0. Now, if 0 < x <
1/N, then 1/x > N, so |f(1/x) − l| < ε. Thus lim_{x→0+} f(1/x) = l. The
reverse direction is similar.


(i) F(x) = x + 2 for all x.
(iii) F(x) = 0 for all x. 


(i) Bounded above and below; minimum value 0; no maximum
value. 

(iii) Bounded below but not above; minimum value 0. 

(v) Bounded above and below. It is understood that a > −1 (so that
−a − 1 < a + 1). If −1 < a ≤ −1/2, then a ≤ −a − 1, so f(x) =
a + 2 for all x in (−a − 1, a + 1), so a + 2 is the maximum and
minimum value. If −1/2 < a ≤ 0, then f has the minimum value
a², and if a ≥ 0, then f has the minimum value 0. Since a + 2 >
(a + 1)² only for [−1 − √5]/2 < a < [−1 + √5]/2, when
a > −1/2 the function f has a maximum value only for a ≤
[−1 + √5]/2 (the maximum value being a + 2).
(vii) Bounded above and below; maximum value 1; minimum value 0. 
(ix) Bounded above and below; maximum value 1; minimum value 
-Ἰ. 
(xi) f has a maximum and minimum value, since f is continuous.
2. (i) n = −2, since f(−2) < 0 < f(−1).
(iii) n = −1, since f(−1) = −1 < 0 < f(0).
3. (i) If f(x) = x¹⁷⁹ + 163/(1 + x² + sin² x), then f is continuous on
R and f(1) > 0, while f(−2) < 0, so f(x) = 0 for some x in
(−2, 1).
5. f is constant, for if f took on two different values, then f would take on
all values in between, which would include irrational values.


7. (1) f(x) = x; 
(2) f(x) = —x; 
(3) f(x) = |x|;
(4) f(x) = −|x|.


10. Apply Theorem 1 to f — g. 

11. If f(0) = 0 or f(1) = 1, choose x = 0 or 1. If f(0) > 0 = I(0) and
f(1) < 1 = I(1), then Problem 10 applied to f and I implies that
f(x) = x for some x.


CHAPTER 8 1. (i) 1 is the greatest element, and the greatest lower bound is 0, which
is not in the set. 

(iii) 1 is the greatest element, and 0 is the least element. 

(v) Since {x: x² + x + 1 ≥ 0} = R, there is no least upper bound or
greatest lower bound.

(vii) Since {x: x < 0 and x² + x − 1 < 0} = ([−1 − √5]/2, 0), the
greatest lower bound is [−1 − √5]/2, and the least upper bound
is 0; neither belongs to the set.

2. (a) Since A ≠ ∅, there is some x in A. Then −x is in −A, so −A ≠ ∅.
Since A is bounded below, there is some y such that y ≤ x for all x in
A. Then −y ≥ −x for all x in A, so −y ≥ z for all z in −A, so
−A is bounded above. Let α = sup(−A). Then α is an upper
bound for −A, so, reversing the argument just given, −α is a lower
bound for A.
Moreover, if β is any lower bound for A, then −β is an upper bound
for −A, so −β ≥ α, so β ≤ −α. Thus −α is the greatest lower
bound for A.

5. (a) If l is the largest integer with l ≤ x, then l + 1 > x, but l + 1 ≤
x + 1 < y. So we can let k = l + 1. (Proof that a largest such
integer l exists: Since N is not bounded above, there is some natural
number n with −n < x < n. There are consequently only a finite
number of integers l with −n ≤ l ≤ x. Pick the largest.)

(b) Since y − x > 0, there is some natural number n with 1/n < y − x.
Since ny − nx > 1, there is, by part (a), an integer k with nx <
k < ny, which means that x < k/n < y.

(c) Choose r + √2(s − r)/2.

(d) By part (b), there is a rational number r with x < r < y, and
therefore a rational number s with x < r < s < y. Apply part (c) to
r < s.


CHAPTER 9


10.


4.

Let k be the largest integer ≤ x/a (the solution to Problem 5 shows that
such a k exists), and let x′ = x − ka ≥ 0. If x − ka = x′ ≥ a, then
x ≥ (k + 1)a, so k + 1 ≤ x/a, contradicting the choice of k. So
0 ≤ x′ < a.

(a) Since any y in B satisfies y ≥ x for all x in A, any y in B is an upper
bound for A, so y ≥ sup A.

(b) Part (a) shows that sup A is a lower bound for B, so sup A ≤ inf B.

Since x ≤ sup A and y ≤ sup B for every x in A and y in B, it follows
that x + y ≤ sup A + sup B. Thus, sup A + sup B is an upper bound
for A + B, so sup(A + B) ≤ sup A + sup B. If x and y are chosen in
A and B, respectively, so that sup A − x < ε/2 and sup B − y < ε/2,
then sup A + sup B − (x + y) < ε. Hence,


sup(A + B) ≥ x + y > sup A + sup B − ε.


(a)
f′(a) = lim_{h→0} [f(a + h) − f(a)]/h = lim_{h→0} [1/(a + h) − 1/a]/h
 = lim_{h→0} −h/(h · a(a + h)) = −1/a².


(b) The tangent line through (a, 1/a) is the graph of

g(x) = −(1/a²)(x − a) + 1/a
 = −x/a² + 2/a.

If f(x) = g(x), then

1/x = −x/a² + 2/a,

or

0 = x² − 2ax + a² = (x − a)²,

so x = a.


2. (a)
f′(a) = lim_{h→0} [f(a + h) − f(a)]/h = lim_{h→0} [1/(a + h)² − 1/a²]/h
 = lim_{h→0} (−2ah − h²)/(ha²(a + h)²) = −2/a³.
(b) The tangent line through (a, 1/a²) is the graph of

g(x) = −(2/a³)(x − a) + 1/a²
 = −2x/a³ + 3/a².


If f(x) = g(x), then

1/x² = −2x/a³ + 3/a²,

or
2x³ − 3ax² + a³ = 0,
or


0 = (x − a)(2x² − ax − a²) = (x − a)(2x + a)(x − a).


So x = a or x = −a/2; the point (−a/2, 4/a²) lies on the opposite
side of the vertical axis from (a, 1/a²).


3.
f′(a) = lim_{h→0} [f(a + h) − f(a)]/h = lim_{h→0} (√(a + h) − √a)/h
 = lim_{h→0} (√(a + h) − √a)(√(a + h) + √a)/(h(√(a + h) + √a))
 = lim_{h→0} 1/(√(a + h) + √a)
 = 1/(2√a).


4. Conjecture: Sₙ′(x) = nxⁿ⁻¹. Proof:

Sₙ′(x) = lim_{h→0} [(x + h)ⁿ − xⁿ]/h
 = lim_{h→0} [Σ_{j=0}^{n} C(n, j) xⁿ⁻ʲ hʲ − xⁿ]/h
 = lim_{h→0} Σ_{j=1}^{n} C(n, j) xⁿ⁻ʲ hʲ⁻¹
 = C(n, 1) xⁿ⁻¹ = nxⁿ⁻¹, since lim_{h→0} hʲ⁻¹ = 0 for j > 1.
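The identity behind Problem 4, that the difference quotient of xⁿ equals the sum of C(n, j) xⁿ⁻ʲ hʲ⁻¹ over j = 1, ..., n, can be confirmed with exact arithmetic. This is a small sketch with made-up function names and arbitrarily chosen sample values; it also shows the gap between the quotient and nxⁿ⁻¹ shrinking as h does.

```python
from fractions import Fraction
from math import comb


def difference_quotient(n, x, h):
    """[(x + h)^n - x^n] / h, exact for rational x and nonzero rational h."""
    return ((x + h) ** n - x ** n) / h


def binomial_sum(n, x, h):
    """Sum over j = 1, ..., n of C(n, j) x^(n - j) h^(j - 1)."""
    return sum(comb(n, j) * x ** (n - j) * h ** (j - 1) for j in range(1, n + 1))


x = Fraction(3, 2)
for n in (1, 2, 5):
    for h in (Fraction(1, 10), Fraction(1, 1000)):
        exact = difference_quotient(n, x, h)
        gap = exact - n * x ** (n - 1)
        print(n, h, exact == binomial_sum(n, x, h), float(gap))
```

Only the j = 1 term survives as h approaches 0, which is the content of the last line of the proof.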




10. 


11. 


f′(x) = 0 for x not an integer, and f′(x) is not defined if x is an integer.
(a)
g′(x) = lim_{h→0} [g(x + h) − g(x)]/h = lim_{h→0} [f(x + h) + c − (f(x) + c)]/h
 = lim_{h→0} [f(x + h) − f(x)]/h = f′(x).
(b)
g′(x) = lim_{h→0} [g(x + h) − g(x)]/h = lim_{h→0} [cf(x + h) − cf(x)]/h
 = c · lim_{h→0} [f(x + h) − f(x)]/h = cf′(x).


(a) f′(9) = 3 · 9²; f′(25) = 3 · (25)²; f′(36) = 3 · (36)².

(b) f′(3²) = f′(9) = 3 · 9²; f′(5²) = f′(25) = 3 · (25)²;
f′(6²) = f′(36) = 3 · (36)².

(c) f′(a²) = 3(a²)² = 3a⁴; f′(x²) = 3(x²)² = 3x⁴.

(d) f′(x²) = 3x⁴; but g(x) = x⁶, so g′(x) = 6x⁵.


(a)
g′(x) = lim_{h→0} [g(x + h) − g(x)]/h = lim_{h→0} [f(x + h + c) − f(x + c)]/h
 = lim_{h→0} [f([x + c] + h) − f(x + c)]/h
 = f′(x + c).
(b)
g′(x) = lim_{h→0} [g(x + h) − g(x)]/h = lim_{h→0} [f(cx + ch) − f(cx)]/h
 = lim_{h→0} c[f(cx + ch) − f(cx)]/(ch) = lim_{k→0} c[f(cx + k) − f(cx)]/k
 = c · f′(cx).

(Compare the manipulations in this calculation with Problem 5-13.)
(c) If g(x) = f(x + a), then g′(x) = f′(x + a), by part (a). But
g = f, so f′(x) = g′(x) = f′(x + a) for all x, which means that f′
is periodic, with period a.
(i) If g(x) = x⁵, then g′(x) = 5x⁴. Now f(x) = g(x + 5), so by Problem
8(a), f′(x) = g′(x + 5) = 5(x + 5)⁴.
(ii) f(x) = (x − 3)⁵, so f′(x) = 5(x − 3)⁴, as in part (i).
(iii) f(x) = (x + 2)⁷, so f′(x) = 7(x + 2)⁶, as in part (i).
If f(x) = g(t + x), then f′(x) = g′(t + x), by Problem 8(a). If f(t) =
g(t + x), then f′(t) = g′(t + x), by Problem 8(a), so f′(x) = g′(2x).
(a) If s(t) = ct², then s′(t) = 2ct, and there is no number k such that
s′(t) = ks(t) [that is, 2ct = kct²] for all t.
(By the way, at this point we do not know any nonzero function f for
which f′ is proportional to f. After Chapter 17 it might be amusing
to determine what the world would be like if Galileo were correct.)


CHAPTER 10


21.


26.


30.


1.

(b) (i) If s(t) = (a/2)t², then s′(t) = at, so s″(t) = a.

(ii) [s′(t)]² = (at)² = 2a · (a/2)t² = 2a · s(t).

(c) The chandelier falls s(t) = 16t² feet in t seconds, so it falls 400 feet
in t seconds if 400 = 16t², or t = 5. Here a/2 = 16, so a = 32, and after
5 seconds the velocity will be s′(5) = 5a = 5 · 32 = 160 feet per second.
The speed was half this amount when 80 = s′(t) = 32t, or t = 5/2.

(a) This is another way of writing the definition (see Problem 5-8).

(b) This follows from Problem 5-10, applied to the functions α(h) =
[f(a + h) − f(a)]/h and β(h) = [g(a + h) − g(a)]/h.


GQ) [΄0) = 6x. 
Gii) 5) = 4... 
(i) means that f′(a) = naⁿ⁻¹ if f(x) = xⁿ.
(iii) means that g′(a) = f′(a) if g(x) = f(x) + c.
(v) means the same as (iii).
(vii) means that g′(b) = f′(b + a) if g(x) = f(x + a).
(ix) means that g′(b) = cf′(cb) if g(x) = f(cx).
(i) (1 + 2x) · cos(x + x²).
(iii) (−sin x) · cos(cos x).
(v) cos(cos x / x) · (−x sin x − cos x)/x².
(vii) (cos(x + sin x)) · (1 + cos x).
(i) (cos((x + 1)²(x + 2))) · [2(x + 1)(x + 2) + (x + 1)²].
(iii) [2 sin((x + sin x)²) cos((x + sin x)²)] · 2(x + sin x)(1 + cos x).
(v) (cos(x sin x)) · (sin x + x cos x) + (cos(sin x²)) · (cos x²) · 2x.
(vii) (2 sin x cos x sin x? sin? x) + (2x cos x? sin? x sin? x?) 

+ (sin x? cos x? sin? x sin x?). 
(ix) 6(x + sin⁵ x)⁵(1 + 5 sin⁴ x cos x).
(xi) cos((sin⁷ x⁷ + 1)⁷) · 7(sin⁷ x⁷ + 1)⁶ · (7 sin⁶ x⁷ · cos x⁷ · 7x⁶).
(xiii) cos(x² + sin(x² + sin x²)) · [2x + cos(x² + sin x²)

· (2x + 2x cos x²)].

(1 + sin x)(2x cos x?+ sin? x + sin x?+ 2 sin x cos x) 
(xv) — cos x sin x? sin? x 

(1 + sin x)? 
3 

(xvii) cos 


x3 x° 3x2 sin x — x? COS x 
3x? sin | — — x? cos | — i a Gee ρα 
sin x sin x sin? x 
eS te 




(ee 1)" 
4. (i) Ge - 2)? 
(iii) = 2x?, 
5. () —x?, 
(iii) 17. 


6. (i) 55) = ε΄ + g(a). 
(1) f(x) = σ΄ + g(x))- (1 + g’(x)). 
(v) 776) = g(a). 
7. (a) A′(t) = 2πr(t)r′(t). Since r′(t) = 6 for that t with r(t) = 4, it follows
that A′(t) = 2π · 4 · 6 = 48π when r(t) = 4.
(b) If V(t) is the volume at time t, then V(t) = 4πr(t)³/3, so V′(t) =
4πr(t)²r′(t) = 4π · 4² · 6 = 384π when r(t) = 4.
(c) First method: Since A′(t) = 2πr(t)r′(t), and A′(t) = 5 for r(t) = 3,
it follows that

r′(t) = A′(t)/(2πr(t)) = 5/(6π) when r(t) = 3.

Thus

V′(t) = 4πr(t)²r′(t)
 = 4π · 9 · 5/(6π)
 = 30 when r(t) = 3.

To apply the second method, we first note that if

f(t) = A(t)^(3/2) = √(A(t)³),

then, using Problem 9-3 and the Chain Rule,

f′(t) = [1/(2√(A(t)³))] · 3A(t)²A′(t)
 = [1/(2A(t)^(3/2))] · 3A(t)²A′(t)
 = (3/2)A(t)^(1/2)A′(t) (just as we might have guessed).

Now

V(t) = 4πr(t)³/3 = 4π[r(t)²]^(3/2)/3
 = 4[πr(t)²]^(3/2)/(3π^(1/2))
 = 4A(t)^(3/2)/(3π^(1/2)).

So

V′(t) = [4/(3π^(1/2))] · (3/2)√(A(t)) A′(t)
 = (2/π^(1/2)) · (9π)^(1/2) · 5
 = 2 · 3 · 5 = 30.


CHAPTER 11


8.


10.


30.


1.


(i)
(iii)

(feh)’(0) = f’(A(0)) - ἀ"(0) = f’(3) - sin*(sin 1) 
= 9(sin 4)sin*(sin 1). 
ot 6c) = A(x!) 2x? = sin*(sin(x* + 1)) 5 2x? 


The Chain Rule implies that 


(iii) 


(v) 


(1) 
(iii) 


(v) 


(f ∘ g)′(x) = f′(g(x)) · g′(x)


(yi 


1 
2)? g'(x). 
dz/dx = (dz/dy) · (dy/dx) = (cos y) · (1 + 2x) = (cos(x + x²)) · (1 + 2x).

dz/dx = (dz/du) · (du/dx) = (−sin u) · (cos x) = (−sin(sin x)) · (cos x).
0 = f′(x) = 3x² − 2x − 8 for x = 2 and x = −4/3, both of which
are in [−2, 2];
f(−2) = 5, f(2) = −11, f(−4/3) = 203/27;
maximum = 203/27, minimum = −11.

0 = f′(x) = 12x³ − 24x² + 12x = 12x(x² − 2x + 1) for x = 0
and x = 1, of which only 0 is in [−1/2, 1/2];
f(−1/2) = 43/16, f(1/2) = 11/16, f(0) = 0;
maximum = 43/16, minimum = 0.

0 = f′(x) = (x² + 1 − (x + 1)2x)/(x² + 1)² = (1 − 2x − x²)/(x² + 1)²
for x = −1 + √2 and x = −1 − √2, of which only −1 + √2
is in [−1, 1/2];
f(−1) = 0, f(1/2) = 6/5, f(−1 + √2) = (1 + √2)/2;
maximum = (1 + √2)/2, minimum = 0.

−4/3 is a local maximum point, and 2 is a local minimum point.
0 is a local minimum point, and there are no local maximum
points.

−1 + √2 is a local maximum point, and −1 − √2 is a local
minimum point.



(a) Notice that f actually has a minimum value, since f is a polynomial
function of even degree. The minimum occurs at a point x with


0 = f′(x) = 2 Σ_{i=1}^{n} (x − aᵢ),


so x = (a₁ + ··· + aₙ)/n.


(i) 3 and 7 are local maximum points, and 5 and 9 are local minimum 
points. 

(iii) All irrational x > 0 are local minimum points, and all irrational
x < 0 are local maximum points.

(v) x is a local maximum (minimum) point if the decimal expansion
contains (does not contain) a 5.

If f(x) is the total length of the path, then


f(x) = √(x² + a²) + √((1 − x)² + b²).


The positive function f clearly has a minimum, since lim_{x→∞} f(x) =
lim_{x→−∞} f(x) = ∞, and f is differentiable everywhere, so the minimum
occurs at a point x with f′(x) = 0. Now, f′(x) = 0 when


x/√(x² + a²) − (1 − x)/√((1 − x)² + b²) = 0.

This equation says that arctan α = arctan β.

It is also possible to notice that f(x) is equal to the sum of the lengths
of the dashed line segment and the line segment from (x, 0) to (1, b).
This is shortest when the two line segments lie along a line (because of
Problem 4-8(b), if a rigorous reason is required); a little plane geometry
shows that this happens when α = β.

If x is the length of one side of a rectangle of perimeter P, then the length
of the other side is (P - 2x)/2, so the area is

$$A(x) = \frac{x(P - 2x)}{2}.$$

So the rectangle with greatest area occurs when x is the maximum point
for A on (0, P/2). Since A is continuous on [0, P/2], and A(0) =
A(P/2) = 0, and A(x) > 0 for x in (0, P/2), the maximum exists.
Since A is differentiable on (0, P/2), the maximum point x satisfies

$$0 = A'(x) = \frac{P - 4x}{2},$$

so x = P/4.
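
The conclusion x = P/4 can be sanity-checked numerically. The following is only a rough sketch: the function names and the sample perimeter are made up for illustration, and a brute-force grid search stands in for the calculus argument.

```python
def rectangle_area(x: float, perimeter: float) -> float:
    """Area A(x) = x(P - 2x)/2 of a rectangle with one side x and perimeter P."""
    return x * (perimeter - 2 * x) / 2


def best_side_by_grid(perimeter: float, steps: int = 10_000) -> float:
    """Return the grid point in (0, P/2) with the largest area (brute force)."""
    half = perimeter / 2
    candidates = [half * k / steps for k in range(1, steps)]
    return max(candidates, key=lambda x: rectangle_area(x, perimeter))


P = 20.0  # hypothetical perimeter chosen for illustration
x_best = best_side_by_grid(P)
print(x_best, P / 4, rectangle_area(x_best, P))
```

The grid search should land on the square, whose area is P^2/16.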


16. 




Let S(r) be the surface area of the right circular cylinder of volume V
with radius r. Since

$$V = \pi r^2 h \quad \text{where } h \text{ is the height},$$

we have $h = V/\pi r^2$, so

$$S(r) = 2\pi r^2 + 2\pi r h = 2\pi r^2 + \frac{2V}{r}.$$

We want the minimum point of S on $(0, \infty)$; this exists, since $\lim_{r \to 0} S(r) = \lim_{r \to \infty} S(r) = \infty$. Since S is differentiable on $(0, \infty)$, the minimum point r satisfies

$$0 = S'(r) = 4\pi r - \frac{2V}{r^2} = \frac{4\pi r^3 - 2V}{r^2},$$

or

$$r = \sqrt[3]{\frac{V}{2\pi}}.$$
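
As a quick numerical illustration of this answer (a sketch only; the helper names and the sample volume are assumptions, not part of the text), one can check that the radius given by the formula beats nearby radii, and that the optimal can is as tall as it is wide.

```python
import math


def surface_area(r: float, volume: float) -> float:
    """S(r) = 2*pi*r^2 + 2V/r for a closed cylinder of the given volume."""
    return 2 * math.pi * r ** 2 + 2 * volume / r


def optimal_radius(volume: float) -> float:
    """Critical point of S: r = (V / (2*pi))^(1/3)."""
    return (volume / (2 * math.pi)) ** (1 / 3)


V = 1000.0  # hypothetical volume chosen for illustration
r_star = optimal_radius(V)
h_star = V / (math.pi * r_star ** 2)
print(r_star, h_star, surface_area(r_star, V))
```

Since V = 2*pi*r^3 at the critical point, the height h = V/(pi r^2) comes out equal to 2r.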


1 is a local maximum point, and 3 is a local minimum point. 
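(a) We have

$$\frac{f(b) - f(a)}{b - a} = f'(x) \quad \text{for some } x \text{ in } (a, b),$$

and $f'(x) \geq M$, so $f(b) - f(a) \geq M(b - a)$.
(b) We have

$$\frac{f(b) - f(a)}{b - a} = f'(x) \quad \text{for some } x \text{ in } (a, b),$$

and $f'(x) \leq m$, so $f(b) - f(a) \leq m(b - a)$.
(c) If $|f'(x)| \leq M$ for all x in [a, b], then $-M \leq f'(x) \leq M$, so

$$f(a) - M(b - a) \leq f(b) \leq f(a) + M(b - a),$$

or

$$|f(b) - f(a)| \leq M(b - a).$$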


(a) f(x) = — cos x + a for some number a (because f(x) = — cos x is 
one such function, and any two such functions differ by a constant 
function). 




17. 


20. 


33. 


(b) $f'(x) = x^4/4 + a$ for some number a, so $f(x) = x^5/20 + ax + b$ for
some numbers a and b.

(c) $f''(x) = x^2/2 + x^3/3 + a$ for some a, so $f'(x) = x^3/6 + x^4/12 + ax + b$ for some a and b, so $f(x) = x^4/24 + x^5/60 + ax^2/2 + bx + c$
for some numbers a, b, and c. Equivalently, and more simply,
$f(x) = x^4/24 + x^5/60 + ax^2 + bx + c$ for some numbers a, b, and c.

(a) Since $s''(t) = -32$, we have $s'(t) = -32t + \alpha$ for some $\alpha$, so
$s(t) = -16t^2 + \alpha t + \beta$ for some $\alpha$ and $\beta$.

(b) Clearly, $s(0) = 0 + 0 + \beta$ and $s'(0) = 0 + \alpha$. Thus, $\alpha = v_0$ and
$\beta = s_0$.

(c) In this case, $s_0 = 0$ and $v_0 = v$, so $s(t) = -16t^2 + vt$. The maximum
value of s occurs when $0 = s'(t) = -32t + v$, or $t = v/32$, so the
maximum value is

$$s\left(\frac{v}{32}\right) = -16\left(\frac{v}{32}\right)^2 + v \cdot \frac{v}{32} = -\frac{v^2}{64} + \frac{v^2}{32} = \frac{v^2}{64}.$$

At that moment the velocity is clearly 0, but the acceleration is -32
(as at any time). The weight hits the ground at time t > 0 when

$$0 = s(t) = -16t^2 + vt,$$

or t = v/16 (it takes as long to fall back down as it took to reach the
top). The velocity is then

$$s'(v/16) = -32\left(\frac{v}{16}\right) + v = -v$$

(the same velocity with which it was initially moving upward).
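
The numbers in part (c) are easy to reproduce. A minimal sketch, assuming an illustrative launch speed of 64 (the function names are hypothetical):

```python
def height(t: float, v: float) -> float:
    """s(t) = -16 t^2 + v t, the position of the weight thrown upward with speed v."""
    return -16 * t ** 2 + v * t


def velocity(t: float, v: float) -> float:
    """s'(t) = -32 t + v."""
    return -32 * t + v


v = 64.0              # hypothetical initial velocity chosen for illustration
t_top = v / 32        # time at which s'(t) = 0
t_land = v / 16       # positive time at which s(t) = 0
print(height(t_top, v), v ** 2 / 64)   # both are the maximum height
print(velocity(t_land, v))             # equals -v
```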
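Apply the Mean Value Theorem to $f(x) = \sqrt{x}$ on [64, 66]:

$$\frac{\sqrt{66} - \sqrt{64}}{66 - 64} = f'(x) = \frac{1}{2\sqrt{x}} \quad \text{for some } x \text{ in } [64, 66].$$

Since 64 < x < 81, we have $8 < \sqrt{x} < 9$, so

$$\frac{1}{2 \cdot 9} < \frac{\sqrt{66} - 8}{2} < \frac{1}{2 \cdot 8}.$$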
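
Multiplying the inequality through by 2 and adding 8 gives 8 + 1/9 < sqrt(66) < 8 + 1/8. A short check of these bounds (a sketch; the variable names are illustrative only):

```python
import math

# Bounds on sqrt(66) obtained from 1/(2*9) < (sqrt(66) - 8)/2 < 1/(2*8).
lower_bound = 8 + 1 / 9
upper_bound = 8 + 1 / 8
root = math.sqrt(66)
print(lower_bound, root, upper_bound)
```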


L’H6pital’s Rule does not lead to the equation 


because $\lim_{x \to 1} (3x^2 + 1) \neq 0$.




(i)
$$\lim_{x \to 0} \frac{x}{\tan x} = \lim_{x \to 0} \frac{1}{\sec^2 x} = \lim_{x \to 0} \cos^2 x = 1.$$
(ii)
$$\lim_{x \to 0} \frac{\cos^2 x - 1}{x^2} = \lim_{x \to 0} \frac{-2 \sin x \cos x}{2x} = -1.$$


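(i) $f^{-1}(x) = (x - 1)^{1/3}$. (If $y = f^{-1}(x)$, then $x = f(y) = y^3 + 1$, so
$y = (x - 1)^{1/3}$.)
(iii) $f^{-1} = f$. (If $y = f^{-1}(x)$, then

$$x = f(y) = \begin{cases} y, & y \text{ rational} \\ -y, & y \text{ irrational;} \end{cases}$$

since $\pm y$ is rational or irrational if and only if y is, we have
y = x if x is rational and y = -x if x is irrational, so y = f(x).)
(v)
$$f^{-1}(x) = \begin{cases} x, & x \neq a_1, \ldots, a_n \\ a_{i-1}, & x = a_i, \ i = 2, \ldots, n \\ a_n, & x = a_1. \end{cases}$$
(vii) $f^{-1} = f$.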


(i) $f^{-1}$ is increasing and $f^{-1}(x)$ is not defined for x < 0.

(iii) $f^{-1}$ is decreasing and $f^{-1}(x)$ is not defined for x < 0.

Suppose f is increasing. Let a < b. Then $f^{-1}(a) \neq f^{-1}(b)$, since $f^{-1}$ is
one-one. So either $f^{-1}(a) < f^{-1}(b)$ or $f^{-1}(a) > f^{-1}(b)$. But if $f^{-1}(a) > f^{-1}(b)$, then

$$b = f(f^{-1}(b)) < f(f^{-1}(a)) = a,$$

a contradiction. The proof is similar for decreasing f, or one can consider -f instead.
Clearly, f + g is increasing, for if a < b, so that f(a) < f(b) and g(a) < g(b), then
$(f + g)(a) = f(a) + g(a) < f(b) + g(b) = (f + g)(b)$.
$f \cdot g$ is not necessarily increasing; for example, if f(x) = g(x) = x. (But
$f \cdot g$ is increasing if f(x), g(x) > 0 for all x.)
$f \circ g$ is increasing, for if a < b, then g(a) < g(b), so f(g(a)) < f(g(b)).
(a) If $(f \circ g)(x) = (f \circ g)(y)$, so that f(g(x)) = f(g(y)), then g(x) =
g(y), since f is one-one, so x = y, since g is one-one.
$(f \circ g)^{-1} = g^{-1} \circ f^{-1}$: for if $y = (f \circ g)^{-1}(x)$, then $x = (f \circ g)(y) = f(g(y))$, so $g(y) = f^{-1}(x)$, so $y = g^{-1}(f^{-1}(x))$.
If f(x) = f(y), then
$$\frac{ax + b}{cx + d} = \frac{ay + b}{cy + d},$$




14. 


15. 


so
$$acxy + bcy + adx + bd = acxy + ady + bcx + bd,$$
or
$$ad(x - y) = bc(x - y).$$


If $ad \neq bc$, this implies that x - y = 0. (But if ad = bc, then f(x) = f(y)
for all x and y in the domain of f.)
If $y = f^{-1}(x)$, then x = f(y), so

$$x = \frac{ay + b}{cy + d},$$

so

$$f^{-1}(x) = y = \frac{dx - b}{-cx + a} \quad \text{for } x \neq a/c.$$

(i) Those intervals [a, b] which are contained in $(-\infty, 0]$ or [0, 2] or
$[2, \infty)$, since f is increasing on $(-\infty, 0]$ and $[2, \infty)$, and decreasing on [0, 2].

(iii) Those intervals [a, b] which are contained in $(-\infty, 0]$ or $[0, \infty)$,
since f is increasing on $(-\infty, 0]$ and decreasing on $[0, \infty)$.

The formula for the derivative reads:

$$\frac{dx}{dy} = \frac{1}{\dfrac{dy}{dx}}.$$

(In this formula, it is understood that dx/dy means $(f^{-1})'(y)$, while dy/dx
is an "expression involving x," and in the final answer x must be replaced
by y, by means of the equation y = f(x).)

The computation in Problem 14, when completed, shows that

$$\frac{d\,x^{1/n}}{dx} = \frac{1}{n (x^{1/n})^{n-1}} = \frac{1}{n x^{1 - (1/n)}} = \frac{1}{n}\, x^{(1/n) - 1}.$$


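$$\begin{aligned} G'(x) &= x (f^{-1})'(x) + f^{-1}(x) - F'(f^{-1}(x)) \cdot (f^{-1})'(x) \\ &= x (f^{-1})'(x) + f^{-1}(x) - f(f^{-1}(x)) \cdot (f^{-1})'(x) \\ &= x (f^{-1})'(x) + f^{-1}(x) - x (f^{-1})'(x) \\ &= f^{-1}(x). \end{aligned}$$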

(i) 
1 1 1 


(0) = GE) = FO ~ amtGin 1) 




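17. Since
$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))},$$
we have
$$(f^{-1})''(y) = \frac{-f''(f^{-1}(y)) \cdot (f^{-1})'(y)}{[f'(f^{-1}(y))]^2} = \frac{-f''(f^{-1}(y))}{[f'(f^{-1}(y))]^3}.$$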


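CHAPTER 13 1. If $P_n = \{t_0, \ldots, t_n\}$ is the partition with $t_i = ib/n$, then

$$L(f, P_n) = \sum_{i=1}^{n} (t_{i-1})^3 (t_i - t_{i-1}) = \sum_{i=1}^{n} \frac{(i-1)^3 b^3}{n^3} \cdot \frac{b}{n} = \frac{b^4}{n^4} \sum_{j=0}^{n-1} j^3 = \frac{b^4}{n^4} \left[ \frac{(n-1)^4}{4} + \frac{(n-1)^3}{2} + \frac{(n-1)^2}{4} \right],$$

and similarly

$$U(f, P_n) = \frac{b^4}{n^4} \sum_{j=1}^{n} j^3 = \frac{b^4}{n^4} \left[ \frac{n^4}{4} + \frac{n^3}{2} + \frac{n^2}{4} \right].$$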


Clearly $L(f, P_n)$ and $U(f, P_n)$ can be made as close to $b^4/4$ as desired by
choosing n sufficiently large, so $U(f, P_n) - L(f, P_n)$ can be made as
small as desired, by choosing n large enough. This shows that f is
integrable. Moreover, there is only one number a with $L(f, P_n) \leq a \leq U(f, P_n)$ for all n; since $\int_0^b x^3\,dx$ has this property, the proof that $\int_0^b x^3\,dx = b^4/4$ will be complete once we show that $L(f, P_n) \leq b^4/4 \leq U(f, P_n)$
for all n. This can be done by a straightforward computation, but it
actually follows from the fact that $L(f, P_n)$ and $U(f, P_n)$ can be made
as close to $b^4/4$ as desired by choosing n sufficiently large. In fact, if it
were true that $b^4/4 < \int_0^b x^3\,dx$, then it would not be possible to make
$U(f, P_n)$ as close as desired to $b^4/4$ by choosing n large enough, since each
$U(f, P_n) \geq \int_0^b x^3\,dx$, and similarly we cannot have $b^4/4 > \int_0^b x^3\,dx$.
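
To see the squeeze concretely, the lower and upper sums for f(x) = x^3 on [0, b] with n equal subintervals can be computed directly. This is only an illustrative sketch (the function names and the choice b = 2 are assumptions); the gap U - L works out to b^4/n.

```python
def lower_sum(b: float, n: int) -> float:
    """L(f, P_n) for f(x) = x^3 on [0, b]: f is increasing, so use left endpoints."""
    width = b / n
    return sum(((i - 1) * width) ** 3 * width for i in range(1, n + 1))


def upper_sum(b: float, n: int) -> float:
    """U(f, P_n) for f(x) = x^3 on [0, b]: use right endpoints."""
    width = b / n
    return sum((i * width) ** 3 * width for i in range(1, n + 1))


b = 2.0  # hypothetical endpoint chosen for illustration
for n in (10, 100, 1000):
    print(n, lower_sum(b, n), b ** 4 / 4, upper_sum(b, n))
```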




6. 


αι -- ἡ ἀπ}. αι τὴ} (w= 1) 
so gg τ 


n> n 
51a 73 πὶ 


Clearly $L(f, P_n)$ and $U(f, P_n)$ can be made as close to $b^5/5$ as desired
by choosing n large enough. As in Problem 1, this implies that

$$\int_0^b x^4\,dx = b^5/5.$$


Lif, Pa) = =| 


a 
Be 
3 
Ne” 
| 
ant 
— 
| 
| 


ὦ fp f=. 
(iii) I * ft = 3. 
(v)  f is not integrable. 
(vii) Ἢ f=. 


(For a rigorous proof that the functions in (i), (iii), and (vii) are
integrable, see Problem 15. The values of the integrals, which are
clear from the geometric picture, can also be deduced rigorously
by using the ideas in the proof of Problem 15, together with known
integrals.)


(1) 
(1G +2)-#]4-5 
(iii) | 
[( — x2) — x2] dx = 
(v) 


[f(t - 26 +4) - ads = 4 


$\int_a^b \left( \int_c^d f(x) g(y)\,dy \right) dx = \int_a^b \left( f(x) \int_c^d g(y)\,dy \right) dx$ (here f(x) is the constant)


10. 


24. 


Zh. 


$= \int_c^d g(y)\,dy \cdot \int_a^b f(x)\,dx$
(here $\int_c^d g(y)\,dy$ is the constant).


(a) Clearly $L(f, P) \geq 0$ for every partition P.

(b) Apply part (a) to f - g, and use the fact that
$\int_a^b (f - g) = \int_a^b f - \int_a^b g$.

(a) 0. 

(b) ὅ. 

(a) Clearly

$$m(b - a) \leq L(f, P) \leq U(f, P) \leq M(b - a)$$


CHAPTER 14 


31. 


aa 




for all partitions P of [a, b]. Consequently,

$$m(b - a) \leq \int_a^b f(x)\,dx \leq M(b - a).$$

Thus

$$\mu = \frac{\int_a^b f(x)\,dx}{b - a}$$

satisfies $m \leq \mu \leq M$.
(b) Let m and M be the minimum and maximum values of f on [a, b].
Since f is continuous, it takes on the values m and M, and consequently the number $\mu$ of part (a).


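Since
$$-|f| \leq f \leq |f|,$$
we have
$$-\int_a^b |f| \leq \int_a^b f \leq \int_a^b |f|,$$
so
$$\left| \int_a^b f \right| \leq \int_a^b |f|.$$

(Problem 30 implies that $\int_a^b |f|$ makes sense.)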


(i) $(\sin^3 x^3) \cdot 3x^2$.

- 1 
τὰς Ut 
an) i 1+ 22+ sin*? 


ὃ 1 
eee 
(v) [ ites sin? t 


, 1 
li "(x)= ————— = F(x). 
vii) (PYG) = BEG - ὦ 
(i) All $x \neq 1$.
(iii) All $x \neq 1$.
(v) All x.

(vii) All $x \neq 0$. (F is not differentiable at 0 because F(x) = 0 for x < 0,
but there are x > 0 arbitrarily close to 0 with F(x) = 4.)
(i) 
1 1 


NO) = FEO) ~ T+ sinsing=))) 
1 


= 1 + sin(sin 0)) a 
$F(x) = x \int_0^x f(t)\,dt$, so

$$F'(x) = x f(x) + \int_0^x f(t)\,dt.$$


w= 0 att) 




9. We can choose

$$f(x) = \frac{x^{1/n + 1}}{\frac{1}{n} + 1}.$$

Then

$$\int_0^b \sqrt[n]{x}\,dx = f(b) - f(0) = \frac{b^{1/n + 1}}{\frac{1}{n} + 1}.$$
CHAPTER 15 1. (i) 
1 . 1 . 1 ° 
1 + arctan?(arctan x) 1 + arctan?x 1+ + 
(111) 
ene See (sec: x arctan x + = ) 
1 + (tan x arctan x)? 1+ x? 
2. (i) 0. 
(iii) 0. 
(ν) 0. 


6. (a) $\sin 2x = \sin(x + x) = \sin x \cos x + \cos x \sin x = 2 \sin x \cos x$.
$\cos 2x = \cos^2 x - \sin^2 x = 2 \cos^2 x - 1 = 1 - 2 \sin^2 x$.
$\sin 3x = \sin(2x + x) = \sin 2x \cos x + \cos 2x \sin x$
$= 2 \sin x \cos^2 x + (\cos^2 x - \sin^2 x) \sin x$
$= 3 \sin x \cos^2 x - \sin^3 x$.
$\cos 3x = \cos(2x + x) = \cos 2x \cos x - \sin 2x \sin x$
$= (\cos^2 x - \sin^2 x) \cos x - 2 \sin^2 x \cos x$
$= \cos^3 x - \sin^2 x \cos x - 2 \sin^2 x \cos x$
$= \cos^3 x - 3 \sin^2 x \cos x$
$= 4 \cos^3 x - 3 \cos x$.
(b) Since $\cos \pi/4 > 0$ and

$$0 = \cos \frac{\pi}{2} = 2 \cos^2 \frac{\pi}{4} - 1,$$

we have $\cos \pi/4 = \sqrt{2}/2$. It follows, since $\sin \pi/4 > 0$ and $\sin^2 + \cos^2 = 1$, that $\sin \pi/4 = \sqrt{2}/2$, and consequently $\tan \pi/4 = 1$.
Similarly, since $\cos \pi/6 > 0$ and

$$0 = \cos \frac{\pi}{2} = 4 \cos^3 \frac{\pi}{6} - 3 \cos \frac{\pi}{6},$$

we have $\cos \pi/6 = \sqrt{3}/2$. It follows, since $\sin \pi/6 > 0$, that
$\sin \pi/6 = \sqrt{1 - (\sqrt{3}/2)^2} = \tfrac{1}{2}$.

8. (a)

$$\tan(x + y) = \frac{\sin(x + y)}{\cos(x + y)} = \frac{\sin x \cos y + \cos x \sin y}{\cos x \cos y - \sin x \sin y} = \frac{\dfrac{\sin x \cos y}{\cos x \cos y} + \dfrac{\cos x \sin y}{\cos x \cos y}}{\dfrac{\cos x \cos y}{\cos x \cos y} - \dfrac{\sin x \sin y}{\cos x \cos y}} = \frac{\tan x + \tan y}{1 - \tan x \tan y}.$$

(b) From part (a) we have

$$\tan(\arctan x + \arctan y) = \frac{\tan(\arctan x) + \tan(\arctan y)}{1 - \tan(\arctan x)\tan(\arctan y)} = \frac{x + y}{1 - xy},$$

provided that $\arctan x$, $\arctan y$, and $\arctan x + \arctan y \neq k\pi + \pi/2$. Since $-\pi/2 < \arctan x, \arctan y < \pi/2$, this is always
the case except when $\arctan x + \arctan y = \pm\pi/2$, which is equivalent to xy = 1. From this equation we can conclude that

$$\arctan x + \arctan y = \arctan\left( \frac{x + y}{1 - xy} \right)$$

provided that $\arctan x + \arctan y$ lies in $(-\pi/2, \pi/2)$, which is
true whenever xy < 1. (If x, y > 0 and xy > 1, so that $\arctan x + \arctan y > \pi/2$, then we must add $\pi$ to the right side, and if x,
y < 0 and xy > 1, so that $\arctan x + \arctan y < -\pi/2$, then we
must subtract $\pi$.)
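
The case analysis in part (b) translates into a small function. The sketch below is illustrative only (the name `arctan_sum` is made up); it applies the correction by pi exactly as described and compares the result with adding the two arctangents directly.

```python
import math


def arctan_sum(x: float, y: float) -> float:
    """arctan x + arctan y computed from arctan((x + y)/(1 - xy)) plus the pi correction."""
    if x * y == 1:
        raise ValueError("xy = 1: the sum is +pi/2 or -pi/2 and the formula does not apply")
    base = math.atan((x + y) / (1 - x * y))
    if x * y > 1:
        # x and y have the same sign here; shift back into the correct range
        return base + math.pi if x > 0 else base - math.pi
    return base


pairs = [(0.5, 0.3), (2.0, 3.0), (-2.0, -3.0), (2.0, -3.0)]
for x, y in pairs:
    print(x, y, arctan_sum(x, y), math.atan(x) + math.atan(y))
```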


10. The first formula is derived by subtracting the second of the following
two equations from the first:

$$\cos(m - n)x = \cos(mx - nx) = \cos mx \cos(-nx) - \sin mx \sin(-nx) = \cos mx \cos nx + \sin mx \sin nx,$$

$$\cos(m + n)x = \cos mx \cos nx - \sin mx \sin nx.$$

The other formulas are derived similarly.




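11. It follows from Problem 10 that if $m \neq n$, then

$$\int_{-\pi}^{\pi} \sin mx \sin nx\,dx = \frac{1}{2} \int_{-\pi}^{\pi} [\cos(m - n)x - \cos(m + n)x]\,dx = \frac{1}{2} \left[ \frac{\sin(m - n)x}{m - n} - \frac{\sin(m + n)x}{m + n} \right]_{-\pi}^{\pi} = 0.$$

But if m = n, then

$$\int_{-\pi}^{\pi} \sin mx \sin nx\,dx = \frac{1}{2} \int_{-\pi}^{\pi} [1 - \cos(m + n)x]\,dx = \frac{1}{2} \left\{ \left[ \pi - \frac{\sin(m + n)\pi}{m + n} \right] - \left[ -\pi - \frac{\sin(-(m + n)\pi)}{m + n} \right] \right\} = \pi.$$

The other formulas are proved similarly.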
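
These orthogonality relations can be spot-checked numerically. The sketch below uses a plain midpoint rule (an assumption made for illustration; any quadrature would do), which is very accurate for trigonometric polynomials over a full period.

```python
import math


def integral_sin_sin(m: int, n: int, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of sin(mx) sin(nx) over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += math.sin(m * x) * math.sin(n * x)
    return total * h


print(integral_sin_sin(2, 3))   # close to 0, since m != n
print(integral_sin_sin(3, 3))   # close to pi, since m = n
```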
14. (a) We have

$$\cos 2x = \cos^2 x - \sin^2 x = 1 - 2 \sin^2 x = 2 \cos^2 x - 1.$$

So

$$\sin^2 x = \frac{1 - \cos 2x}{2}, \qquad \cos^2 x = \frac{1 + \cos 2x}{2}.$$

(b) These formulas follow from part (a), because $\cos x/2 \geq 0$ and
$\sin x/2 \geq 0$ (since $0 \leq x \leq \pi/2$).


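(c)
$$\int_a^b \sin^2 x\,dx = \int_a^b \frac{1 - \cos 2x}{2}\,dx = \frac{1}{2}(b - a) - \frac{1}{4}(\sin 2b - \sin 2a).$$
$$\int_a^b \cos^2 x\,dx = \int_a^b \frac{1 + \cos 2x}{2}\,dx = \frac{1}{2}(b - a) + \frac{1}{4}(\sin 2b - \sin 2a).$$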


18. (a) $\arctan 1 - \arctan 0 = \pi/4$.
(b) $\lim_{x \to \infty} \arctan x - \arctan 0 = \pi/2$.

19. $\lim_{x \to \infty} x \sin \dfrac{1}{x} = \lim_{x \to 0^+} \dfrac{\sin x}{x} = 1$.


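20. (a) $(\sin^\circ)'(x) = \dfrac{\pi}{180} \cos\left( \dfrac{\pi x}{180} \right) = \dfrac{\pi}{180} \cos^\circ(x)$.

$(\cos^\circ)'(x) = -\dfrac{\pi}{180} \sin\left( \dfrac{\pi x}{180} \right) = -\dfrac{\pi}{180} \sin^\circ(x)$.

$$\lim_{x \to 0} \frac{\sin^\circ x}{x} = \lim_{x \to 0} \frac{\sin(\pi x/180)}{x} = \lim_{x \to 0} \frac{\pi}{180} \cdot \frac{\sin(\pi x/180)}{\pi x/180} = \frac{\pi}{180}.$$

$$\lim_{x \to \infty} x \sin^\circ \frac{1}{x} = \lim_{x \to 0^+} \frac{\sin^\circ x}{x} = \frac{\pi}{180}.$$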

CHAPTER 17 1. (i) eet - ee ots φῆ, 
(iii) (sin x)*™ in 2) [(log(sin x)) " cos(sin x) * cos x 
+ (cos x/sin x) " sin(sin ~x)]. 
(v) — sin x" 2""*[(log(sin x)) " sin x*!"* 
- {(log(sin x)) - cos x + (cos x/sin x) + sin x} 
+ (cos x/sin x) + sin x7], 


" ; x \ Pistia ε") x _ (cos ¢*)e* 
(vil) | arcsin { - log { arcsin [ -- oo 
sin x sin x sin εὖ 
sin x — COS x 
2 
arcsin ( ie vi - ( ἐξ ‘sin? x 
sin x sin x 


(ix) (log x)= - oe log ὃ 2 iris ‘| 


+ log(sin 45)" 


log x x 
3. (i) 0. 
(iii) ἐ. 
(v) 4. 
5. (a)
$$\cosh^2 x - \sinh^2 x = \left( \frac{e^x + e^{-x}}{2} \right)^2 - \left( \frac{e^x - e^{-x}}{2} \right)^2 = \frac{e^{2x} + 2 + e^{-2x}}{4} - \frac{e^{2x} - 2 + e^{-2x}}{4} = 1.$$


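(c)
$$\begin{aligned} \sinh x \cosh y + \cosh x \sinh y &= \left( \frac{e^x - e^{-x}}{2} \right)\left( \frac{e^y + e^{-y}}{2} \right) + \left( \frac{e^x + e^{-x}}{2} \right)\left( \frac{e^y - e^{-y}}{2} \right) \\ &= \frac{e^{x+y} + e^{x-y} - e^{-x+y} - e^{-x-y}}{4} + \frac{e^{x+y} - e^{x-y} + e^{-x+y} - e^{-x-y}}{4} \\ &= \frac{e^{x+y} - e^{-(x+y)}}{2} = \sinh(x + y). \end{aligned}$$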
(e) Since $\sinh x = \dfrac{e^x - e^{-x}}{2}$,




7. 


we have
$$\sinh'(x) = \frac{e^x + e^{-x}}{2} = \cosh x.$$
(g) Since
$$\tanh x = \frac{\sinh x}{\cosh x},$$
we have
$$\tanh'(x) = \frac{(\cosh x)^2 - (\sinh x)^2}{\cosh^2 x} = \frac{1}{\cosh^2 x} \quad \text{by part (a)}.$$


(a) If y = arg cosh x, then $y \geq 0$ and

$$x = \cosh y = \sqrt{1 + \sinh^2 y} \quad \text{by part (a)}.$$

So

$$\sinh(\text{arg cosh } x) = \sinh y = \sqrt{x^2 - 1} \quad \text{since } \sinh y \geq 0 \text{ for } y \geq 0.$$
(c)
$$(\text{arg sinh})'(x) = \frac{1}{\sinh'(\text{arg sinh}(x))} = \frac{1}{\cosh(\text{arg sinh}(x))} = \frac{1}{\sqrt{1 + x^2}} \quad \text{by part (b)}.$$
(e)
$$(\text{arg tanh})'(x) = \frac{1}{\tanh'(\text{arg tanh}(x))} = \cosh^2(\text{arg tanh}(x)).$$
Now,
$$\tanh^2 y + \frac{1}{\cosh^2 y} = 1 \quad \text{by Problem 5(b)},$$
so
$$\tanh^2(\text{arg tanh}(x)) + \frac{1}{\cosh^2(\text{arg tanh}(x))} = 1,$$
or
$$\cosh^2(\text{arg tanh}(x)) = \frac{1}{1 - x^2}.$$


(a) If y = arg sinh x, then

$$x = \sinh y = \frac{e^y - e^{-y}}{2},$$

so
$$e^y - e^{-y} = 2x,$$
$$e^{2y} - 2x e^y - 1 = 0,$$
$$e^y = \frac{2x \pm \sqrt{4x^2 + 4}}{2},$$
so
$$e^y = x + \sqrt{1 + x^2} \quad \text{since } e^y > 0,$$
or
$$y = \text{arg sinh } x = \log(x + \sqrt{1 + x^2}).$$
Similarly,
$$\text{arg cosh } x = \log(x + \sqrt{x^2 - 1}),$$
$$\text{arg tanh } x = \tfrac{1}{2} \log(1 + x) - \tfrac{1}{2} \log(1 - x).$$
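
These three logarithmic formulas can be compared against the inverse hyperbolic functions in Python's standard `math` module (`asinh`, `acosh`, `atanh`). The wrapper names below are illustrative only.

```python
import math


def arg_sinh(x: float) -> float:
    """log(x + sqrt(1 + x^2)), valid for all real x."""
    return math.log(x + math.sqrt(1 + x * x))


def arg_cosh(x: float) -> float:
    """log(x + sqrt(x^2 - 1)), valid for x >= 1."""
    return math.log(x + math.sqrt(x * x - 1))


def arg_tanh(x: float) -> float:
    """(1/2) log(1 + x) - (1/2) log(1 - x), valid for -1 < x < 1."""
    return 0.5 * math.log(1 + x) - 0.5 * math.log(1 - x)


print(arg_sinh(1.5), math.asinh(1.5))
print(arg_cosh(2.0), math.acosh(2.0))
print(arg_tanh(0.4), math.atanh(0.4))
```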
(b)
$$\int_a^b \frac{1}{\sqrt{1 + x^2}}\,dx = \text{arg sinh } b - \text{arg sinh } a \quad \text{by Problem 6(c)}$$
$$= \log(b + \sqrt{1 + b^2}) - \log(a + \sqrt{1 + a^2}).$$

$$\int_a^b \frac{1}{\sqrt{x^2 - 1}}\,dx = \log(b + \sqrt{b^2 - 1}) - \log(a + \sqrt{a^2 - 1}).$$

$$\int_a^b \frac{1}{1 - x^2}\,dx = \frac{1}{2} [\log(1 + b) - \log(1 - b) - \log(1 + a) + \log(1 - a)].$$
8. (a) $\lim_{x \to \infty} a^x = \lim_{x \to \infty} e^{x \log a}$. Since log a < 0, we have $\lim_{x \to \infty} x \log a = -\infty$,
so $\lim_{x \to \infty} e^{x \log a} = 0$.
rin Se i So. 
Xx yr © ey 


xr @ 


(e) $\lim_{x \to 0^+} x^x = \lim_{x \to 0^+} e^{x \log x}$. Now, $\lim_{x \to 0^+} x \log x = 0$ by part (d), so $\lim_{x \to 0^+} x^x = 1$.


Ds. (in sd eapSiow ay =e 
ey nie ee lim log(1 +-9)/y = 1. 
(c)
$$e = \exp(1) = \exp\left( \lim_{x \to \infty} x \log(1 + 1/x) \right) \overset{(*)}{=} \lim_{x \to \infty} \exp(x \log(1 + 1/x)) = \lim_{x \to \infty} (1 + 1/x)^x.$$

(The starred equality depends on the continuity of exp at 1, and
can be justified as follows. For every $\varepsilon > 0$ there is some $\delta > 0$ such
that $|e - \exp y| < \varepsilon$ for $|y - 1| < \delta$. Moreover, there is some N such
that $|x \log(1 + 1/x) - 1| < \delta$ for x > N. So $|e - \exp(x \log(1 + 1/x))| < \varepsilon$
for x > N.)




CHAPTER 18 


15. 


16. 


we 


18. 


1. 


(d)
$$e^a = \left[ \lim_{x \to \infty} (1 + 1/x)^x \right]^a = \lim_{x \to \infty} (1 + 1/x)^{ax} = \lim_{y \to \infty} (1 + a/y)^y.$$


After one year the number of dollars yielded by an initial investment of
one dollar will be

$$\lim_{x \to \infty} (1 + a/100x)^x = e^{a/100}.$$

Clearly $f'(x) = 1/x$ for x > 0. If x < 0, then $f(x) = \log(-x)$, so
$f'(x) = \dfrac{1}{-x} \cdot (-1) = \dfrac{1}{x}$.
Let $g(x) = f(x)/e^{cx}$. Then

$$g'(x) = \frac{e^{cx} f'(x) - f(x)\, c e^{cx}}{(e^{cx})^2} = 0,$$

so there is some number k such that g(x) = k for all x.
(a) According to Problem 16, there is some k such that $A(t) = ke^{ct}$.
Then $k = ke^{c \cdot 0} = A_0$. So $A(t) = A_0 e^{ct}$.
(b) If $A(t + \tau) = A(t)/2$, then

$$A_0 e^{c(t + \tau)} = \frac{A_0 e^{ct}}{2},$$

so $e^{ct} e^{c\tau} = e^{ct}/2$, or $e^{c\tau} = \tfrac{1}{2}$, so $\tau = -(\log 2)/c$. It is easy to check that
this $\tau$ does work.
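
The closing remark, that this value of tau does work, can be checked in a couple of lines. The decay constant and initial amount below are hypothetical values chosen only for illustration (c must be negative for decay).

```python
import math


def amount(t: float, a0: float, c: float) -> float:
    """A(t) = A_0 * e^(c t)."""
    return a0 * math.exp(c * t)


def half_life(c: float) -> float:
    """tau = -(log 2)/c, the time after which the amount is halved."""
    return -math.log(2) / c


c = -0.1       # hypothetical decay constant
a0 = 50.0      # hypothetical initial amount
tau = half_life(c)
for t in (0.0, 3.0, 10.0):
    print(t, amount(t + tau, a0, c), amount(t, a0, c) / 2)
```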
Newton's law states that, for a certain (positive) number c,

$$T'(t) = c(T - M),$$

which can be written

$$(T - M)' = c(T - M).$$

So by Problem 16 there is some number k such that

$$T(t) - M = ke^{ct},$$

and $k = ke^{c \cdot 0} = T(0) - M$. So $T(t) = M + (T(0) - M)e^{ct}$.


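(i) $(\sqrt[5]{x^3} + \sqrt[6]{x})/\sqrt{x} = x^{1/10} + x^{-1/3}$.

(ii) $\dfrac{1}{\sqrt{x - 1} + \sqrt{x + 1}} = \dfrac{\sqrt{x - 1} - \sqrt{x + 1}}{-2}$.

(iii) $(e^x + e^{2x} + e^{3x})/e^{4x} = e^{-3x} + e^{-2x} + e^{-x}$.
(iv) $a^x/b^x = (a/b)^x = e^{x \log(a/b)}$.

(v) $\tan^2 x = \sec^2 x - 1$.

(vi) $\dfrac{1}{a^2 + x^2} = \dfrac{1}{a^2} \cdot \dfrac{1}{1 + (x/a)^2}$.

(vii) $\dfrac{1}{\sqrt{a^2 - x^2}} = \dfrac{1}{a} \cdot \dfrac{1}{\sqrt{1 - (x/a)^2}}$.

(viii) $\dfrac{1}{1 + \sin x} = \dfrac{1 - \sin x}{1 - \sin^2 x} = \dfrac{1 - \sin x}{\cos^2 x} = \sec^2 x - \sec x \tan x$.

(ix) $\dfrac{8x^2 + 6x + 4}{x + 1} = 8x - 2 + \dfrac{6}{x + 1}$.

(x) $\dfrac{1}{\sqrt{2x - x^2}} = \dfrac{1}{\sqrt{1 - (x - 1)^2}}$.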
2. (i) $-\cos e^x$. (Let $u = e^x$.)

(iii) $(\log x)^2/2$. (Let $u = \log x$.)
(v) $e^{e^x}$. (Let $u = e^x$.)
(vii) $2e^{\sqrt{x}}$. (Let $u = \sqrt{x}$.)
(ix) $-(\log(\cos x))^2/2$. (Let $u = \log(\cos x)$.)
3. (i) $\int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx = x^2 e^x - \left[ 2x e^x - \int 2e^x\,dx \right]$
$= x^2 e^x - 2x e^x + 2e^x$.


(iii) We have

$$\int e^{ax} \sin bx\,dx = \frac{e^{ax} \sin bx}{a} - \frac{b}{a} \int e^{ax} \cos bx\,dx = \frac{e^{ax} \sin bx}{a} - \frac{b}{a} \left[ \frac{e^{ax} \cos bx}{a} - \frac{1}{a} \int e^{ax} (-b \sin bx)\,dx \right],$$

so

$$\int e^{ax} \sin bx\,dx = \frac{a}{a^2 + b^2}\, e^{ax} \sin bx - \frac{b}{a^2 + b^2}\, e^{ax} \cos bx.$$
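
One way to gain confidence in an antiderivative like this is to differentiate it numerically and compare with the integrand. The sketch below does that with a central difference; the function names and the sample values of a, b, and x are assumptions made for illustration.

```python
import math


def integrand(x: float, a: float, b: float) -> float:
    """e^(ax) sin(bx)."""
    return math.exp(a * x) * math.sin(b * x)


def antiderivative(x: float, a: float, b: float) -> float:
    """(a e^(ax) sin(bx) - b e^(ax) cos(bx)) / (a^2 + b^2)."""
    return math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / (a * a + b * b)


def central_difference(x: float, a: float, b: float, h: float = 1e-5) -> float:
    """Numerical derivative of the antiderivative at x."""
    return (antiderivative(x + h, a, b) - antiderivative(x - h, a, b)) / (2 * h)


a, b = 2.0, 3.0   # hypothetical constants chosen for illustration
for x in (-1.0, 0.0, 0.7):
    print(x, central_difference(x, a, b), integrand(x, a, b))
```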


(v) Using the result $\int (\log x)^2\,dx = x(\log x)^2 - 2x(\log x) + 2x$ from
the text, we have

$$\begin{aligned} \int (\log x)^3\,dx &= [x(\log x)^2 - 2x(\log x) + 2x] \log x - \int \frac{1}{x} [x(\log x)^2 - 2x(\log x) + 2x]\,dx \\ &= x(\log x)^3 - 2x(\log x)^2 + 2x \log x - \int (\log x)^2\,dx + 2[x \log x - x] - 2x \\ &= x(\log x)^3 - 2x(\log x)^2 + 2x \log x - [x(\log x)^2 - 2x(\log x) + 2x] + 2[x \log x - x] - 2x \\ &= x(\log x)^3 - 3x(\log x)^2 + 6x \log x - 6x. \end{aligned}$$




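(vii)
$$\begin{aligned} \int \sec^3 x\,dx &= \int (\sec^2 x)(\sec x)\,dx = \tan x \sec x - \int (\tan x)(\sec x \tan x)\,dx \\ &= \tan x \sec x - \int \sec x (\sec^2 x - 1)\,dx \\ &= \tan x \sec x - \int \sec^3 x\,dx + \int \sec x\,dx, \end{aligned}$$

so

$$\int \sec^3 x\,dx = \tfrac{1}{2} [\tan x \sec x + \log(\sec x + \tan x)].$$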


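(ix)
$$\int \sqrt{x} \log x\,dx = \frac{2}{3} x^{3/2} \log x - \int \frac{2}{3} x^{3/2} \cdot \frac{1}{x}\,dx = \frac{2}{3} x^{3/2} \log x - \frac{2}{3} \int x^{1/2}\,dx = \frac{2}{3} x^{3/2} \log x - \frac{4}{9} x^{3/2}.$$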


4. (i) Let x = sin u, dx = cos u du. The integral becomes

$$\int \frac{\cos u\,du}{\sqrt{1 - \sin^2 u}} = \int 1\,du = u = \arcsin x.$$

(iii) Let x = sec u, dx = sec u tan u du. The integral becomes

$$\int \frac{\sec u \tan u\,du}{\sqrt{\sec^2 u - 1}} = \int \sec u\,du = \log(\sec u + \tan u) = \log(x + \sqrt{x^2 - 1}).$$

(v) Let x = sin u, dx = cos u du. The integral becomes

$$\int \frac{\cos u\,du}{\sin u \sqrt{1 - \sin^2 u}} = \int \csc u\,du = -\log(\csc u + \cot u) = -\log\left( \frac{1}{x} + \frac{\sqrt{1 - x^2}}{x} \right).$$


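(vii) Let x = sin u, dx = cos u du. The integral becomes

$$\begin{aligned} \int (\sin^3 u \cos u) \cos u\,du &= \int \sin^3 u \cos^2 u\,du = \int (\sin u)(1 - \cos^2 u) \cos^2 u\,du \\ &= \int (\sin u)(\cos^2 u - \cos^4 u)\,du = -\frac{\cos^3 u}{3} + \frac{\cos^5 u}{5} \\ &= -\frac{(1 - x^2)^{3/2}}{3} + \frac{(1 - x^2)^{5/2}}{5}. \end{aligned}$$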


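(ix) Let x = tan u, dx = sec² u du. The integral becomes

$$\begin{aligned} \int \sec u \sec^2 u\,du &= \int \sec^3 u\,du \\ &= \tfrac{1}{2} [\tan u \sec u + \log(\sec u + \tan u)] \quad \text{by Problem 3(vii)} \\ &= \tfrac{1}{2} [x \sqrt{1 + x^2} + \log(x + \sqrt{1 + x^2})]. \end{aligned}$$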




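5. (i) Let $u = \sqrt{x + 1}$, $x = u^2 - 1$, dx = 2u du. The integral becomes

$$\int \frac{2u\,du}{1 + u} = \int \left( 2 - \frac{2}{1 + u} \right) du = 2u - 2 \log(1 + u) = 2\sqrt{x + 1} - 2 \log(1 + \sqrt{x + 1}).$$

(iii) Let $u = x^{1/6}$, $x = u^6$, $dx = 6u^5\,du$. The integral becomes

$$\int \frac{6u^5\,du}{u^3 + u^2} = 6 \int \left( u^2 - u + 1 - \frac{1}{u + 1} \right) du = 2u^3 - 3u^2 + 6u - 6 \log(u + 1) = 2\sqrt{x} - 3\sqrt[3]{x} + 6\sqrt[6]{x} - 6 \log(\sqrt[6]{x} + 1).$$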


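(v) Let u = tan x, x = arctan u, $dx = du/(1 + u^2)$. The integral
becomes

$$\begin{aligned} \int \frac{du}{(1 + u^2)(2 + u)} &= \int \left( \frac{1/5}{2 + u} + \frac{-u/5 + 2/5}{1 + u^2} \right) du \\ &= \frac{1}{5} \int \frac{du}{2 + u} - \frac{1}{10} \int \frac{2u\,du}{1 + u^2} + \frac{2}{5} \int \frac{du}{1 + u^2} \\ &= \frac{1}{5} \log(2 + u) - \frac{1}{10} \log(1 + u^2) + \frac{2}{5} \arctan u \\ &= \frac{1}{5} \log(2 + \tan x) - \frac{1}{10} \log(1 + \tan^2 x) + \frac{2x}{5}. \end{aligned}$$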


(vii) Let $u = 2^x$, x = (log u)/(log 2), dx = du/(u log 2). The integral
becomes

$$\begin{aligned} \frac{1}{\log 2} \int \frac{u^2 + 1}{(u + 1)u}\,du &= \frac{1}{\log 2} \int \left( 1 + \frac{1}{u} - \frac{2}{u + 1} \right) du \\ &= \frac{1}{\log 2} [u + \log u - 2 \log(u + 1)] \\ &= \frac{1}{\log 2} [2^x + x \log 2 - 2 \log(2^x + 1)]. \end{aligned}$$


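(ix) Let $u = \sqrt{x}$, $x = u^2$, dx = 2u du. The integral becomes

$$\int \frac{\sqrt{1 - u^2}}{1 - u}\, 2u\,du.$$

Now let u = sin y, du = cos y dy. The integral becomes

$$\begin{aligned} \int \frac{2 \sin y \cos^2 y}{1 - \sin y}\,dy &= \int \frac{2 \sin y (1 - \sin^2 y)}{1 - \sin y}\,dy \\ &= 2 \int (1 + \sin y) \sin y\,dy \\ &= 2 \int \sin y\,dy + \int (1 - \cos 2y)\,dy \\ &= -2 \cos y + y - \frac{\sin 2y}{2} = -2 \cos y + y - \sin y \cos y \\ &= -2\sqrt{1 - u^2} + \arcsin u - u\sqrt{1 - u^2} \\ &= -2\sqrt{1 - x} + \arcsin \sqrt{x} - \sqrt{x}\sqrt{1 - x}. \end{aligned}$$

The substitution $u = \sqrt{1 - x}$, $x = 1 - u^2$, dx = -2u du leads to

$$\int \frac{-2u^2\,du}{1 - \sqrt{1 - u^2}},$$

and the substitution u = sin y then leads to

$$\begin{aligned} \int \frac{-2 \sin^2 y \cos y\,dy}{1 - \cos y} &= -2 \sin y - y - \sin y \cos y \\ &= -2u - \arcsin u - u\sqrt{1 - u^2} \\ &= -2\sqrt{1 - x} - \arcsin \sqrt{1 - x} - \sqrt{1 - x}\sqrt{x}. \end{aligned}$$

These answers agree up to a constant, since

$$\arcsin \sqrt{x} = \frac{\pi}{2} - \arcsin \sqrt{1 - x}$$

(check this by comparing their derivatives and their values for
x = 0).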
6. In these problems J will denote the original integral. 


(i) 
2 3 
oa 
3 


(iii) 
1 
"πὰ ΕΝ hy 


— -----.-....ς------.... —— $$ ὄ. .“-.- --ς. 




(v) 


1 

Ὁ}. ὝΕΣ, w+ [oe 

= 4 log(x? + 1) + 4 arctan x. 

1 2x 
at τές τος ἢ ΞΘ eee 
Joan*tl ess Τα 
2χ + 1 1 

- {4 e+ loa a es A eres ta 


Now 


= - ax = a Serene ax 
etet1 @+h? +4 


(vii) 


= Baan S(o+4) 
fee se arctan (- (. <2 1)) 
sO 
I = log(x + 1) + log(x? ++%+ 1) -- 2V3 arctan (2. (. of 1)) 


(ix) 


ae eee ee ee ee, en 

rf te join 

- | wttine $f 2 ᾿ τὼ 
ΤΉ (Flt 5) [+4 


Now the substitution 
2 ( ᾿ ν3 
μ -ἰ χ +-—) dx = — du 
"V3 
changes the second integral to 


146 V3 du 


owe eee οὅὄϑ ..... 


9 2 J) (+1)? 




Using the reduction formula, this can be written 


ines . 1 du |-- V3 [560 Ὁ Ὁ 3 arctans 


ὮΝ aaa) A lies 9 4 
SO 
1 V3 4 
ees ioe 1 
fq Ὁ og (5 ae ) 


v3 arctan Gea (ο ae ἢ} 


11. The equation $\int e^x \sin x\,dx = e^x \sin x - e^x \cos x - \int e^x \sin x\,dx$ means
that any function F with $F'(x) = e^x \sin x$ can be written $F(x) = e^x \sin x - e^x \cos x - G(x)$ where G is another function with $G'(x) = e^x \sin x$. Of
course, G = F + c for some number c, but it is not necessarily true that
F = G.


12. (a)

$$\int \arcsin x\,dx = \int 1 \cdot \arcsin x\,dx = x \arcsin x - \int \frac{x}{\sqrt{1 - x^2}}\,dx = x \arcsin x + \sqrt{1 - x^2}.$$
13. (a) 


3 
: sin* x 008 x x COS x 
[sin x a = = [ sin? x as 


ΕἸ. 
ἘΠῚ 
= __ sin’ x cos x REA ig +3|- we 
4 2 
35 


ee x COS x in x cos x 


ἘΞ eee --- a +: a 
i 2 2 2 
| ΓΞ ( cos ΞῚ ΓΞ [ (: _ COs 2x ἧς cos Ἢ x 
2 4 2 4 


Σ- πε: ee 
x 
4 
3x 
8 


εἶχ 


4 4 2 
sin 2x 1x sin 4x 
rear | aka a 
sin 2x sin 4x 

4 32 


(b) It follows that these two answers are the same, since they have the 
same value for x = 0. 




17. (a) 

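$$\begin{aligned} \int \sin^n x\,dx &= \int (\sin x)(\sin^{n-1} x)\,dx \\ &= -\cos x \sin^{n-1} x + (n - 1) \int (\cos x)(\sin^{n-2} x)(\cos x)\,dx \\ &= -\cos x \sin^{n-1} x + (n - 1) \int (\sin^{n-2} x - \sin^n x)\,dx, \end{aligned}$$

so

$$\int \sin^n x\,dx = -\frac{1}{n} \cos x \sin^{n-1} x + \frac{n - 1}{n} \int \sin^{n-2} x\,dx.$$

(b)

$$\begin{aligned} \int \cos^n x\,dx &= \int (\cos x)(\cos^{n-1} x)\,dx \\ &= \sin x \cos^{n-1} x + (n - 1) \int \sin x (\cos^{n-2} x) \sin x\,dx \\ &= \sin x \cos^{n-1} x + (n - 1) \int (\cos^{n-2} x - \cos^n x)\,dx, \end{aligned}$$

so

$$\int \cos^n x\,dx = \frac{1}{n} \sin x \cos^{n-1} x + \frac{n - 1}{n} \int \cos^{n-2} x\,dx.$$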


(c) 
x? dx 


ere eer 
As oe eae ae 
ee enieroc! 


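$$\int \frac{dx}{(1 + x^2)^n} = \frac{1}{2(n - 1)} \cdot \frac{x}{(x^2 + 1)^{n-1}} + \frac{2n - 3}{2(n - 1)} \int \frac{dx}{(x^2 + 1)^{n-1}}.$$

We can also use the substitution x = tan u, dx = sec² u du, which
changes the integral to

$$\begin{aligned} \int \frac{\sec^2 u\,du}{\sec^{2n} u} &= \int \cos^{2n-2} u\,du \\ &= \frac{1}{2n - 2} \cos^{2n-3} u \sin u + \frac{2n - 3}{2n - 2} \int \cos^{2n-4} u\,du \\ &= \frac{1}{2n - 2} \cdot \frac{1}{(\sqrt{1 + x^2})^{2n-3}} \cdot \frac{x}{\sqrt{1 + x^2}} + \frac{2n - 3}{2n - 2} \int \frac{dx}{(1 + x^2)^{n-1}} \\ &= \frac{1}{2(n - 1)} \cdot \frac{x}{(1 + x^2)^{n-1}} + \frac{2n - 3}{2n - 2} \int \frac{dx}{(1 + x^2)^{n-1}}. \end{aligned}$$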




CHAPTER 19 


CHAPTER 21 


1. 


i; 


(111) 


(v) 


(vii) 
(ix) 


$P_{3,0}(x) = e + ex + ex^2 + (5e/3!)x^3$.
is sae a a στ τρις 


Ponxj2(x) = 1 5} A! + 
(2n)! 
pone 2 en n 
Pails) τόξα 1) + SSI pp UE 
Pao(x) =x + x3. 
Paine) a LR Ge a Oe ae NL ae 


If f is a polynomial function of degree n, then $f^{(n+1)} = 0$. It follows from
Taylor's Theorem that $R_{n,a}(x) = 0$, so $f(x) = P_{n,a}(x)$.


(i) 
(iii) 


(i) 


(iii) 


(v) 


(i) 
(iii) 
(v) 


(i) 
(iii) 


(v) 


(vii) 


(ix) 


$-12 + 2(x - 3) + (x - 3)^2$
$243 + 405(x - 3) + 270(x - 3)^2 + 90(x - 3)^3 + 15(x - 3)^4 + (x - 3)^5$
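
The coefficients 243, 405, 270, 90, 15, 1 are what the binomial theorem gives when $x^5$ is written as $((x - 3) + 3)^5$. The following sketch (the function name is illustrative, and identifying this expansion with $x^5$ is an inference from the coefficients) reproduces them and checks the identity at a few integer points.

```python
from math import comb


def power_expanded_about(n: int, a: int) -> list[int]:
    """Coefficients c_k with x**n == sum(c_k * (x - a)**k), via the binomial theorem."""
    return [comb(n, k) * a ** (n - k) for k in range(n + 1)]


coeffs = power_expanded_about(5, 3)
print(coeffs)

# Spot-check the identity x^5 = sum c_k (x - 3)^k at several integers.
checks = [sum(c * (x - 3) ** k for k, c in enumerate(coeffs)) == x ** 5 for x in range(-4, 8)]
print(all(checks))
```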


(—1)* (3 1 = ) 
ge —___-__ < 107)7 for 2n +2 > 19, orn> 9). 
Ly Οἱ τ 1)1 (n+ 2)! i  νοΥθ 
8 5 
y τὺ (since oe - « 10~*° for 2n + 2> 18 
LY 2*(22 + 1)! 22"72(2n + 2)! = ss 
orn > 8) 
τ 
Dt 22. 25τῚ 
= (since = < 107° forn+1> 12, orn> 11). 
of 2! (n+ 1)! 
Ci = ay +. b;. 
cg = + 1)a,. 


Co >= Ἢ f(t) dt; C= ai_1/t for 2 > 0. 


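$1 - n/(n + 1) = 1/(n + 1) < \varepsilon$ for $n + 1 > 1/\varepsilon$.
$\lim_{n \to \infty} \left( \sqrt[8]{n^2 + 1} - \sqrt[4]{n + 1} \right) = \lim_{n \to \infty} \left( \sqrt[8]{n^2 + 1} - \sqrt[8]{n^2} \right) + \lim_{n \to \infty} \left( \sqrt[4]{n} - \sqrt[4]{n + 1} \right) = 0 + 0 = 0$. (Each of these two limits can be
proved in the same way that $\lim_{n \to \infty} (\sqrt{n + 1} - \sqrt{n}) = 0$ was
proved in the text.)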

Clearly $\lim_{n \to \infty} (\log a)/n = 0$. So $\lim_{n \to \infty} \sqrt[n]{a} = \lim_{n \to \infty} e^{(\log a)/n} = e^0$ (by
Theorem 1) $= 1$.

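$\sqrt[n]{n^2} \leq \sqrt[n]{n^2 + n} \leq \sqrt[n]{2n^2}$, so $(\sqrt[n]{n})^2 \leq \sqrt[n]{n^2 + n} \leq \sqrt[n]{2}\,(\sqrt[n]{n})^2$,
and $\lim_{n \to \infty} (\sqrt[n]{n})^2 = \lim_{n \to \infty} \sqrt[n]{2}\,(\sqrt[n]{n})^2 = 1$ by parts (v) and (vi).

Clearly $\alpha(n) \leq \log_2 n$, and $\lim_{n \to \infty} (\log_2 n)/n = 0$.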




4. (a) If 0 < a < 2, then $a^2 < 2a < 4$, so $a < \sqrt{2a} < 2$.
(b) Part (a) shows that

$$\sqrt{2} < \sqrt{2\sqrt{2}} < \sqrt{2\sqrt{2\sqrt{2}}} < \cdots < 2,$$

so the sequence converges by Theorem 2.
(c) If this sequence is denoted by $\{a_n\}$, then the sequence $\{\sqrt{2a_n}\}$ is
the same as $\{a_{n+1}\}$. So the hint shows that $l = \sqrt{2l}$, or l = 2.
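
A few terms of this sequence show the behavior described in parts (a) through (c). This is just an illustrative sketch (the function name and the number of terms are arbitrary choices): each term is the square root of twice the previous one, starting from the square root of 2.

```python
import math


def nested_root_sequence(count: int) -> list[float]:
    """First `count` terms of a_1 = sqrt(2), a_{n+1} = sqrt(2 * a_n)."""
    terms = []
    a = math.sqrt(2)
    for _ in range(count):
        terms.append(a)
        a = math.sqrt(2 * a)
    return terms


terms = nested_root_sequence(20)
print(terms[:5])
print(terms[-1])
```

The terms increase, stay below 2, and approach 2, in line with the limit found in part (c).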


5. If x is rational, then $n!\pi x$ is a multiple of $\pi$ for sufficiently large n, so
$(\cos n!\pi x)^{2k} = 1$ for all such n, so $\lim_{n \to \infty} \left( \lim_{k \to \infty} (\cos n!\pi x)^{2k} \right) = 1$. If x is irrational, then $n!\pi x$ is not a multiple of $\pi$ for any n, so $|\cos n!\pi x| < 1$, so
$\lim_{k \to \infty} (\cos n!\pi x)^{2k} = 0$, so f(x) = 0.


6. (i) $\int_0^1 e^x\,dx = e - 1$. (Use partitions of [0, 1] into n equal parts.)

(iii) $\int_0^1 \dfrac{1}{1 + x}\,dx = \log 2$.


ἷ 1 1 
(v) [ Gag) ee Ἐπ᾿ τὰ ΞΞ > 


CHAPTER 22 1. (i) (Absolutely) convergent, since $|(\sin n\theta)/n^2| \leq 1/n^2$.
(iii) Divergent, since the first 2n terms have sum Σ + +++ + 1/n. 
(Leibnitz’s Theorem does not apply since the terms are not 
decreasing in absolute value.) 
(v) Convergent, since 


for sufficiently large n. 
(vii) Convergent, since 


(n+ τ τα lim Cy. _— 0. 


lim 
n n+ 1 


n> © n?/n! n— ὦ 


(ix) Divergent, since 1/(log n) > 1/n.
(xi) Convergent, since $1/(\log n)^n < 1/2^n$ for n > 9.


(xiii) Divergent, since 


for large enough n. 
(xv) Divergent, since 


$$\int_3^N \frac{1}{x \log x}\,dx = \log(\log N) - \log(\log 3) \to \infty \quad \text{as } N \to \infty.$$
3 x log x 




CHAPTER 23 


Ul - ἢ 1: - ἢ τ ἰς -- 1 Ὁ ..0 τ τ Ἐ Ὲ 
δε πιν ὐνετο νου: 
See ee a ΟΝ τον λα θῇ 

1 1 
-3(1444+5+5+°-:) 
42 3 
ee 
oad 
-- 1 
= 


1. 


(Notice that f(x) = 1/(x log x) is decreasing on $[3, \infty)$, since

$$f'(x) = \frac{-[1 + \log x]}{(x \log x)^2} < 0 \quad \text{for } x > e.)$$


(xvii) Convergent, since 1/n?(log n) < 1/n? for n > 2. 
(xix) Convergent, since

$$\lim_{n \to \infty} \frac{2^{n+1}(n + 1)!/(n + 1)^{n+1}}{2^n n!/n^n} = \lim_{n \to \infty} \frac{2(n + 1) n^n}{(n + 1)^{n+1}} = \lim_{n \to \infty} \frac{2}{(1 + 1/n)^n} = \frac{2}{e} < 1$$

by Problem 17-12.
(a) For each N we clearly have

$$0 \leq \sum_{n=1}^{N} a_n 10^{-n} \leq 9 \sum_{n=1}^{\infty} 10^{-n} = 1,$$

so $\sum_{n=1}^{\infty} a_n 10^{-n}$ converges by the boundedness criterion, and lies
between 0 and 1. (Actually, this number is denoted by $0.a_1 a_2 a_3 a_4 \ldots$
only when the sequence $\{a_n\}$ is not eventually 0.)
The area of the shaded region is 3. The integral is 


70) = lim fale) = | ἜΜ 


{f,} does not converge uniformly to ἢ. 


Gi) f(x) = lim fn(x) = 0 (since lim x” = © for x > 1). 


n— ® 


The se- 


quence 1 me does not converge uniformly to f; in fact, for any n 


we have f,(x) large for sufficiently large x. 




(v) $f(x) = \lim_{n \to \infty} f_n(x) = 0$, and $\{f_n\}$ converges uniformly to f, since
$|f_n(x)| \leq 1/n$ for all x.


1 x x? 
7: ΚΣ. Baers ores 
(i) a az κα 
1 
ap Σ (0847) # 
k=0 
Nene 
k 2k+1 
W) 2k + 1 
3. ὃ. 4 
( If 
x? x3 x* 
ἘΣ ΡῈ = <1 
Oe as ae , I< 
then 
3 
fie) ax- S45 n. 


log(1 an Pi |x| <1, 


so for |x| < 1 we have $f(x) = (1 + x)\log(1 + x) - (1 + x) + c$
for some number c. Since f(0) = 0, we have c = 1, so $f(x) = (1 + x)\log(1 + x) - x$ for |x| < 1.


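4. Since
$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n + 1)!},$$
we have
$$f(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n + 1)!}$$

(notice that the right side is 1 for x = 0). So

$$f^{(k)}(0) = \begin{cases} \dfrac{(-1)^n (2n)!}{(2n + 1)!}, & k = 2n \\ 0, & k \text{ odd.} \end{cases}$$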
CHAPTER 24 1. (i) $|3 + 4i| = 5$; $\theta = \arctan \tfrac{4}{3}$.

(iii) $|(1 + i)^5| = (|1 + i|)^5 = (\sqrt{2})^5$; since $\pi/4 = \arctan 1/1$ is an argument for 1 + i, an argument for $(1 + i)^5$ is $5\pi/4$.
(v) $|\,|3 + 4i|\,| = |5| = 5$; $\theta = 0$.




CHAPTER 26 


(i) 
a wee Ni rd 
x= — a 
3 Se V5 i 
Ser 
_ (1+ Vo). (—1 -- V5)b 
2 2 
(iii) $x^2 + 2ix - 1 = (x + i)^2$, so the only solution is x = -i.
(v) $x^3 - x^2 - x - 2 = (x - 2)(x^2 + x + 1)$. The solutions are
$$2, \quad -\frac{1}{2} + \frac{\sqrt{3}}{2} i, \quad -\frac{1}{2} - \frac{\sqrt{3}}{2} i.$$


(i) All z = iy with y real.

(iii) All z on the perpendicular bisector of the line segment between 
a and ὁ. 

(v) No z, since |x +7y) «1 -- τὰ implies that x > 1, and that 
x+y? «1 — 2x + x4, so γ «1 -- 2x, which is impossible, 
since 1 — 2x < 0 for x > 1. 

ty) Sere ye aa Cay) a ie ay, 

$(z + \bar{z})/2 = [(x + iy) + (x - iy)]/2 = x$.

$(z - \bar{z})/2i = [(x + iy) - (x - iy)]/2i = y$.

$|z + w|^2 + |z - w|^2 = (z + w)(\bar{z} + \bar{w}) + (z - w)(\bar{z} - \bar{w}) = 2z\bar{z} + 2w\bar{w} = 2(|z|^2 + |w|^2)$. Geometrically, this says that the sum of the
squares of the diagonals of a parallelogram equals the sum of the squares
of the sides.
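
Python's built-in complex numbers make this identity easy to spot-check. The sample values of z and w below are arbitrary, and the function names are made up for illustration.

```python
def diagonals_squared(z: complex, w: complex) -> float:
    """|z + w|^2 + |z - w|^2: the squared lengths of the two diagonals."""
    return abs(z + w) ** 2 + abs(z - w) ** 2


def sides_squared(z: complex, w: complex) -> float:
    """2(|z|^2 + |w|^2): the squared lengths of all four sides."""
    return 2 * (abs(z) ** 2 + abs(w) ** 2)


z, w = 3 + 4j, 1 - 2j   # hypothetical sample points
print(diagonals_squared(z, w), sides_squared(z, w))
```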


(i) Converges absolutely, since $|(1 + i)^n/n!| \leq (\sqrt{2})^n/n!$, and
$\sum_{n=1}^{\infty} (\sqrt{2})^n/n!$ converges.


n=1 


(iii) Converges, but not absolutely, since the real terms form the series 
-E+2-84+4--°-° 
and the imaginary terms form the series 
(Gee ea eS), 
(v) Diverges, since the real terms form the series 


4 
log 3, 4 los 406 5.1 1067.» .1δ8 ὃ Ιορ 9 


2 4 5 7 ὃ gs 


Ἔ + +> 


(i) 


(iii) 


(v) 


(i) 


(iii) 


(v) 




The limit 
n+] 2 2 
re i Sc agen pr Qa rem 
n—> |z|"/n? n> ὦ n 


is <1 for |z| <1, but >1 for |z| > 1. 
The limit 


is <1 for |z| < 1, but >1 for |z| > 1.


The limit 
25 Ὶ (n+1)! 
lim ie liek = lim 2|2{ Ὁ 91 ηὶ 
n—> ὦ 2 Σ n— @ 


is 0 for |z| < 1, but $\infty$ for |z| > 1.
The limits 


[235 ef ὅπτ [515 τι [2] 


pe 3n = V/3 πάθεα Qnti ν΄} 


are <1 for $|z| < \sqrt{2}$, so the series converges absolutely for
$|z| < \sqrt{2}$. But the series does not converge absolutely for
$|z| > \sqrt{2}$, so the radius of convergence is $\sqrt{2}$.


Since 
n ' n A/n3 
lim fac = ἡ τς lim “ἢ = 0, 
n—- 2 ΜΕ n— o 7 no ἢ 


the series converges absolutely for all z, so the radius of convergence is $\infty$.
The limit 


lim V 2"%2"! = 2 lim 2"! 


n— 7 n> © 


is 0 for |z| < 1, but $\infty$ for |z| > 1, so the radius of convergence is 1.


GLOSSARY 
OF SYMBOLS 


P 9 

la| 11 

Vx 12 
max(x,y) 16 
min(x,y) 16 
e (“epsilon”) 19 
N 21 

G 23 

n! 23 

Σ αἱ 24 
i=1 

Z 25 

Q 25 

R 25, 495 


Ὁ 

k 

f(x) 38, 45, 507 
I 41 

Ἐς 41 
AI\B 41 
fig 41 

fig 41 
c-g 41 
lee ox .} 41 
rare ae 
[στὰ 42 
δ΄} 42 

fog 42 

fogoh 43 

x—> f(x) 43 


max(f,g) 49 

min(f,g) 49 

{=e 15 

the pair (a, 5) 52 

the open interval (a, δ) 
54 

[α, Ὁ] 55 

(α, 0) 55 

ία, 0) 55 




(--ο,4) 55 

(-- .ο,4] 55 

(= eo, 00 ) 55 

[x] 70 

{x} 70 

ὃ (“delta”) 78 
lim f(x) 81 

limf 81 

lim f(x) 86 

lim f(x) 86 

zia 

lim f(x) 86 

lim f(x) 87 

χα 

lim f(x) 87 

lim f(x) 92 
lim f(x) = ὦ 92 
lim f(x) = © 92 


sup A 112 
lub A 112 
inf A 112 
glb A 112 
lim A 121 
lim sup A 121 
lim A 121 
f(a) 127 
~ Azz 
af(x) 130 
dx 

df(x) 

dx z=a 
Τ᾽ 153] 
Ὁ 137 
f® 1537 
45[(] 
dx? 
Γ᾿ 201 
R(f, a, 6) 214 
LG, Py 215 
U(f, P) 215 


ie 219 
[1 ἀχ 225 


151 


138 


Lf, "249 
U [ f 249 
[1 254 
[ΙΔ 254 
[ro 254 


[CF 254 
sin® 257 
sin’ 257 

π 258 

A(x) 259 
cos 259, 261 
sin 259, 261 
sec 263 
tan 263 
csc 263 

cot 263 
arcsin 263 
arccos 264 
arctan 265 
“(f, P) 275 
L(x) 276 
log 285 
exp 287 

e 287 

e* 288 

a” 288 

log, 289 
sinh 296 
cosh 296 
tanh 296 2 
arg sinh 296 | 
arg tanh 296 
arg cosh 296 
Nap log 299 
F(x) 5 302° 
Sf 303 


S f(*) dx 303 
I(x) 327 




lim a, 373 


n— © 
lim a, = © 
n-7>@ 


y 382 
lim x, 385 


n> ὦ 


lim sup xp 
Nt © 


lim x, 385 


n— @ 


376 


285 


lim inf x, 385 


R— © 


N(n; a,b) 386 


Σ Qn 389 
n=1 


i 433, 439 


C 438 

z 441 

|z| 441 

Re 448 

Im 448 

6 449 

lim f(z) 449 


f(a) 458 
sin 470 
cos 470 
exp 470 
b, 478 
B, 479 
D 479 
DE 479 


eP 479. 

A 479 

gn 481 

Yn 482 

+ 487, 497 
487, 501 

0 487, 498 

1 488, 502 

—a 488, 498 

~™ 488, 502 

489 

490, 496 

490, 496 

490, 496 

490, 496 

501 


IAIVMAV 5 


= 


INDEX 


AalbmcndoE, 235 
Abel, Niels Henrik, 332, 430 
Abel summable, 431 
Abel’s lemma, 385 
Abel’s Theorem, 431 
Absolute value, 11 

of a complex number, 441 
Absolutely convergent, 397, 463 
Absolutely summable, 397 
Acceleration, 137 
Acta Eruditorum, 124

Addition, 3 

associative law for, 9 

commutative law of, 9 

of complex numbers, 438 

geometric interpretation of, 442 

Addition formula 

for arcsin, 270 

for arctan, 270 

for cos, 266 

for sin, 265, 266 

for tan, 269 
Additive identity, existence of, 9 
Additive inverses, existence of, 9 
Algebra, Fundamental Theorem of, 

317, 445, 455, 474 

Algebraic functions, 302 
Algebraic number, 362 
Algebraist’s real numbers, 505 
Almost lower bound, 120 
Almost upper bound, 120 
Analyst’s real numbers, 505 
Angle, 256 

directed, 256 
Antidiagonal, 213 
Arabic numerals, multiplication of, 8 
Arc length, 275 
Arccos, 264 
Archimedes, 116, 119, 224 
Archimedian property 

for the rational numbers, 490 

for the real numbers, 116 
Arcsec, 272, 322 
Arcsin, 263 

addition formula for, 270 

derivative of, 264 

Taylor series for, 429 




Arctan, 264 
addition formula for, 270 
derivative of, 264 
Taylor polynomials for, 335, 342 
remainder term for, 351 
Area, 214, 219 
Arg cosh, 296 
Arg sinh, 296 
Arg tanh, 296 
Argument, 443 
Argument function, 449 
discontinuities of, 453 
Argument of the hyperbolic functions, 
296 
Arithmetic mean, 32 
Arrow, “x arrow sin(x²),” 43
Ars conjectandi, 481 
Associative law 
for addition, 9 
for multiplication, 9 
Average velocity, 128 
Axis 
horizontal, 55 
imaginary, 441 
real, 441 
vertical, 55 


Bacon, Francis, vi 

Bandes, D., handiwork of, 

Basic properties of numbers, 3 

“Bent” graphs, 125 

Bernoulli, 124, 481 

Bernoulli numbers, 478 

Bernoulli polynomials, 481 

Bernoulli’s inequality, 31 

Big game hunting, mathematical theory 
of, 459 

Binary operation, 487 

Binomial coefficient, 27, 357 

Binomial series, 409, 429 

Binomial theorem, 28 

Bisection argument, 120, 459 

Bohr, Harald, 328 

Bolzano-Weierstrass Theorem, 378, 
386, 459 

Bound 

almost lower, 120 



almost upper, 120 

greatest lower, 112 

least upper, 111, 490 

lower, 112 

upper, 111, 490 
Bounded above, 100, 111, 377, 490 
Bounded below, 112, 377 
Bourbaki, Nicholas, 124 


Cauchy, 239 
Cauchy Condensation Theorem, 410 
Cauchy criterion, 390 
Cauchy form of the remainder, 345, 
347 

Cauchy Mean Value Theorem, 178 
Cauchy sequence, 379, 477 
Cauchy sequences, equivalence of, 505 
Cauchy-Hadamard formula, 476 
Cauchy-Schwarz inequality, 239
Cesaro summable, 408 
Chain Rule, 150 ff. 

proof of, 154 
Change, rate of, 128 
Characteristic (of a field), 492 
Circle, 63 

“f circle g,” 42

unit, 64 
Circle of convergence, 466 
Classical notation 

for derivatives, 130-131, 138, 142-143, 162, 212

for integrals, 225 
Cleio, 161 
Closed interval, 55 
Closed rectangle, 454 
Closure under addition, 9 
Closure under multiplication, 9 
Commutative law

for addition, 9 

for multiplication, 9 
Comparison test, 391, 407 
Comparison Theorem, Sturm, 275 
Complete induction, 23 
Complete ordered field, 490, 509 
Completing the square, 18, 318 
Complex analysis, 473 


Complex function 

continuous, 453 

differentiable, 457 

graph of, 449 

limit of, 449 

nondifferentiable, 458 

Taylor series for, 469 
Complex nth root, 443 
Complex numbers, 433, 438 

absolute value of, 441 

addition of, 438 

geometric interpretation of, 442 

geometric interpretation of, 442 

imaginary part of, 438 

logarithm of, 477 

modulus of, 441 

multiplication of, 438 

geometric interpretation of, 442-443

real part of, 438 

sequence of, 462 
Complex plane, 440 
Complex power series, 464 

circle of convergence of, 466 

radius of convergence of, 466 
Complex sequences, 462-463 
Complex series, 463 
Complex-valued functions, 448 
Composition of functions, 42 
Concave function, 192 
Conditionally convergent series, 398 
Conjugate, 441, 446 
Constant function, 41 
Construction of the real numbers, 494 
Continuous at a, 93, 452 
Continuous function, 93, 453 

nowhere differentiable, 135, 422 
Continuous on (a, b), 96
Continuous on [a, b], 96
Contraction, 384 
Contraction lemma, 385 
Converge 

pointwise, 415 

uniformly, 415, 419 
Convergent sequence, 373, 462 
Convergent series, 389, 463 

absolutely, 397, 463 

conditionally, 398 


Convex function, 191
Convex hull, 460 
Convex subset of the plane, 460 
Cooling, Newton’s law of, 298 
Coordinate 
first, 55 
second, 55 
Coordinate system, 55 
origin of, 55 
“Corner,” 58 
Cos, 256, 259, 273-274, 276, 470 
addition formula for, 266 
derivative of, 148, 260 
inverse of, see Arccos 
Taylor polynomials for, 334 
remainder term for, 348 
Cosh, 296 
Cosine, hyperbolic, 296 
Cot, 263 
Countable, 369 
Counting numbers, 21 
Critical point, 165 
Critical value, 165 
Csc, 263 
Cubic equation, general solution, 435 


Darboux’s Theorem, 187, 242, 253 
Decimal expansion, 70, 407 
Decreasing function, 170 
Dedekind, Richard, 36 
Defined implicitly, 211 
Definite integral, 304 
DEFINITION, 45 
Definition, recursive, 33 
Degree (of a polynomial), 40 
Degree measurements, 61, 257-258 
Delicate ratio test, 408 
Delicate root test, 408 
De Moivre’s Theorem, 443 
Dense, 118 
Derivative, 125 ff., 127 

classical notation for, 130-131, 138, 142-143, 162, 212

of f, 127 

of f at a, 127 

higher-order, 137 

“infinite,” 134




Derivative, left-hand, 132
Leibnizian notation for, see Derivative, classical notation for
“negative infinity,” 134
right-hand, 132 
second, 137 
Derivative of quotient, incantation for, 
147 
Diagonal, 202 
Difference operator, 479 
Differentiable, 127, 457 
Differential equation, 247, 253, 273, 
275, 301, 359
initial conditions for, 360 
Differentiation, 144 ff. 
implicit, 211 
Differentiation operator, 479 
Directed angle, 256 
Discontinuities of a nondecreasing 
function, 370 
Discontinuity, removable, 99 
Disraeli, Benjamin, 2 
Distance, 56, 441 
shortest between two points, 276 
Distributive law, 9 
Diverge, 373, 463 
Division, 6 
Division by zero, 6 
Domain, 38, 39, 45, 507 
Double intersection, 141 
Double root, 160 
Durége, 36 


e, 287 
irrationality of, 353 
relation with π, 368, 471 
transcendentality of, 364 
value of, 288, 350 
Elementary function, 302 
Ellipse, 64 
Empty collection, 23 
Entire function, 474 
Epsilon, 19 
Equal up to order n, 340 
Equality, order of, 340 
Equations, differential, see Differential 
equations 




Equivalent Cauchy sequences, 505 

Euler, 481 

Euler-Maclaurin Summation Formula, 
482 

Euler’s number, 382 

Even function, 49, 174 

Even number, 25 

Eventually inside, 462 


Exhaustion, method of, 119
Exp, 287 ff., 470
Taylor polynomials for, 335
remainder term for, 349-350
Expansion, decimal, 70, 407
Extension of a function, 93


Factorial, 23
Factorials, table of, 356
Factorization into primes, 31
Fibonacci, 31
Fibonacci Association, 32
Fibonacci Quarterly, The, 32
Fibonacci sequence, 31, 430, 478
Field, 487
characteristic of, 492
complete ordered, 490, 509
ordered, 489
First coordinate, 55
First Fundamental Theorem of Calculus, 240
First Mean Value Theorem for Integrals, 237
Fixed point of a function, 384
Fourier series, 270, 272
Function, 37, 45
from A to B, 507
absolute value, 448
argument, 449, 453
complex-valued, 448
composition of, 42
concave, 192
conjugate, 448
constant, 41
continuous, 93 ff.
convex, 191
critical point of, 165
critical value of, 165
decreasing, 170
derivative of, 125 ff.
differentiable, 127
elementary, 203
entire, 474
even, 49, 174
exponential, 287-288
extension of, 93
fixed point of, 384
graphs of, 54-63, 173, 449
hyperbolic, 296
identity, 41
imaginary part, 448
increasing, 170
integrable, 219
integral of, 219
inverse, 201 ff.
linear, 56
local maximum point of, 164
local minimum point of, 164
local strict maximum point of, 190
logarithm, 285, 289
maximum point of, 163
maximum value of, 163
minimum value of, 163
most general definition of, 507
negative part of, 49
nondecreasing, 213
nonincreasing, 213
nonnegative, 49
notation for, 43
odd, 49, 174
periodic, 69, 140, 252
polynomial, 40
positive part of, 49
power, 58
product of, 41
quotient of, 41
rational, 40
real part, 448
real-valued, 448
“reasonable,” 96, 125, 156
regulated, 431
step, 235
strict maximum point of, 190
sum of, 41
trigonometric, 256 ff.
value of at x, 38

Fundamental Theorem of Algebra, 317, 
445, 455, 474 
Fundamental Theorem of Calculus 
First, 240 
Second, 244 


Galileo, 124, 140 

Gamma function, 327, 364 
Geometric mean, 32 
Geometric series, 390 

Global properties, 101 

Goes to, “x goes to sin(x²),” 43
Graph sketching, 171-176
Graphs, 54-63, 72-73, 173, 449 
Greatest lower bound, 112 
Grow faster than, 292, 301 


Hadamard, 476 

Half-life (of radioactive substance), 298 
Hermite, 363 

High-school student’s real numbers, 505 
Higher-order derivatives, 137 

Hilbert, 363 

Horizontal axis, 55 

Hyperbola, 65 

Hyperbolic functions, 296 


Identity 
additive, 9 
multiplicative, 9 
Identity function, 41 
Identity operator, 480 
Imaginary axis, 441 
Imaginary part of complex number, 
438 
Imaginary part function, 448 
Implicit differentiation, 211 
Implicitly defined, 211
Improper integral, 254, 329 
Incantation for derivative of quotient, 
147 
Increasing at a, 189 
Increasing function, 170 
Increasing sequence, 377 
Indefinite integral, 304 




Indefinite integrals, short table of, 304-305
Induction, mathematical, 21 
complete, 23 
Inductive set of real numbers, 33 
Inequalities, 9 
in an ordered field, 490 
Inequality 
Bernoulli’s, 31 
Cauchy-Schwarz, 239
geometric-arithmetic mean, 32
Jensen’s, 199 
Liouville’s, 369 
Schwarz, 18, 239
triangle, 68 
Infimum, 112 
“Infinite” derivative, 134 
Infinite intervals, 55 
Infinite products, 282, 328 
Infinite sequence, 372, 462 
Infinite series, 389 
Infinite sum, 354, 388 
Infinitely many primes, 31 
“Infinitely small,” 131, 225
Infinity, 55 
minus, 55 
Inflection point, 197 
Initial conditions in differential equations, 360
Instantaneous speed, 128 
Instantaneous velocity, 128 
Integer, 25 
Integrable, 219 
Integral, 219 
classical notation for, 225, 226 
definite, 304 
First Mean Value Theorem for, 237 
improper, 254-255, 329 
indefinite, 304 
short table of, 304-305 
Leibnizian notation for, see Integral,
classical notation for
lower, 249 
Second Mean Value Theorem for, 
237 
Third Mean Value Theorem for, 237 
upper, 249 




Integral form of the remainder, 345, 
346 
Integral sign, 219 
Integral test, 396 
Integration 
limits of, 219 
by parts, 305 ff. 
of rational functions, 316 
by substitution, 307 ff. 
Interest (finance), 298 
Intermediate Value Theorem, 102, 109, 
113, 253 
Interpolation, Lagrange, 47 
Intersection of sets, 41 
Interval, 54 
closed, 55 
infinite, 55 
open, 54 
see also Nested Intervals Theorem 
Inverse 
additive, 9 
multiplicative, 9 
Inverse of a function, 201 ff. 
Inverses of trigonometric functions, see 
Trigonometric functions 
Irrational numbers, 25 
Isomorphic fields, 508 
Isomorphism, 508 


Jensen’s inequality, 199 
Johnson, Samuel, 514 
Jump, 58 


Klingenstein, K., trials and tribulations 
of,


Lagrange form of the remainder, 345, 
347 

Lagrange interpolation formula, 47 

Large negative, 62 

Least upper bound, 111 ff., 490 

Least upper bound property, 113 

Lebesgue, see Riemann-Lebesgue 
Lemma 

Left-hand derivative, 132 


Leibniz, 124, 131, 225 
Leibnizian notation for derivatives, 131, 
142-143, 162, 212 
for higher-order derivatives, 138 
Leibniz’s formula, 159 
Leibniz’s Theorem, 398 
Lemma, 82 
Length, 275 
L’Hôpital, Marquis de, 124
L’Hôpital’s Rule, 179, 186-187
Limit, 72 ff., 78, 449 
from above, 86 
from below, 86 
“does not exist,” 81
at infinity, 87 
of integration, 219 
of a sequence, 373 
uniqueness of, 80 
Limit point, 386, 459 
Limit superior, 121, 385 
Lindemann, 367 
Line, real, 54 
Line, tangent, see Tangent line 
Linear functions, 56 
Liouville, 368 
Liouville’s inequality, 369 
Liouville’s Theorem, 474
Local maximum point of function, 164 
higher-order derivative test for, 339 
second derivative test for, 176 
Local minimum point of function, 164, 
see also Local maximum point 
Local property, 89, 101, 142 
Local strict maximum point, 190 
Log, 285 
Taylor polynomials for, 335 
remainder term for, 351 
Logarithm 
to the base 10, 283 
of a complex number, 477 
Napierian, 299 
Lower bound, 112 
almost, 120 
greatest, 112 
Lower integral, 249 
Lower limit of integration, 219 
Lower sum, 215 
Lowest terms, 71 


Maclaurin, 482 
Mass, rate of change of, 128 
Mathematical induction, 21 
Maximum point of a function, 163 
local, 164, see also Local maximum 
point 
local strict, 190 
strict, 190 
Maximum of two numbers, 16 
Maximum value of function, 163 
Mean, arithmetic, 32 
geometric, 32 
Mean Value Theorem, 168, 169 
Cauchy, 178 
for integrals (First, Second, and 
Third), 237 
Method of exhaustion, 119 
Minimum of function, 163 
Minimum point of a function, local,
164, see also Local minimum point 
Minimum of two numbers, 16 
Minus infinity, 55 
Mirifici logarithmorum canonis
descriptio, 299
Modulus of a complex number, 441 
Mollerup, Johannes, 328 
Multiplication, 5 
of arabic numerals, 8 
associative law for, 9 
closure under, 9 
commutative law of, 9 
of complex numbers, 438 
geometric interpretation, 442-443 
Multiplicative identity, existence of, 9 
Multiplicative inverses, existence of, 9 
Multiplicity (of a root), 108 


Napier, 299 

Napierian Logarithm, 299 

Natural numbers, 21, 33 

Negative, large, 62 

“Negative infinity,” derivative, 134
Negative number, 9 

Negative numbers, product of two, 7 
Negative part of a function, 49 
Nested Intervals Theorem, 120 
Newton, 131, 235 




Newton’s law of cooling, 298 
Newton’s laws of motion, 137 
Nondecreasing function, 213 
Nondecreasing sequence, 377 
Nondifferentiable complex functions, 
458 
Nonincreasing function, 213 
Nonnegative function, 49 
Nonnegative sequence, 391 
Notational nonsense, 479 
Nowhere-differentiable continuous 
function, 422 

nth root, 69, 443 

existence of, 103, 443, 460 

primitive, 447 
Number 

algebraic, 362 

complex, 433, 438 

counting, 21 

even, 25 

imaginary, 433 

irrational, 25 

natural, 25, 33 

odd, 25 

prime, 31 

rational, 25 

real, 25, 441, 495 

transcendental, 362 
Numbers, basic properties of, 3 
Null set, 23 


Odd function, 49, 174 
Odd number, 25 
One-one function, 200 
Open interval, 54 
“Or,” 6 
Order of equality, 340 
Ordered field, 489 
complete, 490 
Ordered pair, 45, 52 
Origin (of a coordinate system), 55 


Pair, 44 

ordered, 45, 52 
Parabola, 58 

area under, 224 




Partial fraction decomposition, 317 
Partial sums, 388 
Partition, 215 
Pascal’s triangle, 27 
“Peak,” 59
Peak point, 378 
Period of a function, 69, 140, 252 
Periodic function, 69, 140, 252 
Perpendicularity of lines, 68 
Petard, H., 459 
Pig, yellow, v, 314 
Pigheaded, 161 
Plane, 56 
complex, 440 
Point, 54 
Point of contact, 192 
Point-slope form of equation of a line, 
57, 68 
Polynomial function, 40 
graph of, 58-59, 173 
multiplicity of roots, 108, 160 
Polynomials, Bernoulli, 481 
Polynomials, Taylor, 334 ff., see also 
specific functions 
Position, rate of change of, 128 
Positive elements of an ordered field, 
489 
Positive element of R, 500 
Positive part of a function, 49 
Positive number, 9 
Power functions, 58 
Power series, complex, 464 
Power series centered at a, 423, 471 
Power series representation, uniqueness 
of, 430 
Powers of 2, table of, 356 
Prime number, 31 
characteristic of a field, 492 
infinitely many of, 31 
unique factorization into, 31 
Primitive, 302 
Primitive nth root, 447
Principia, 235 
Product, 5 
of functions, 41 
of two negative numbers, 7 
Pythagorean theorem, 25, 56 


π, 258 
Archimedes’s approximation of, 119 
irrationality of, 279 
relation to e, 471 
transcendentality of, 368 
value of, 357 
Vieta’s product for 2/π, 282
Wallis’s product for π/2, 328


Quaternions, 493 
Quotient of functions, 41 


Rabbits, growth of population, 32 
Radian measure, 61, 257-258 
Radioactive decay, 298 
Radius of convergence of complex 
power series, 466 
Rate of change of mass, 128 
Rate of change of position, 128 
Ratio test, 393 
delicate, 408 
Rational functions, 40 
integration of, 316 
Rational numbers, 25 
Real axis, 441 
Real line, 54 
Real number (formal definition), 495 
Real numbers, 25 
algebraist’s, 505 
analyst’s, 505
Archimedian property of, 116, 490 
construction of, 494 ff. 
high-school student’s, 505 
inductive set of, 505 
Real part of a complex number, 438 
Real part function, 448 
Real-valued function, 448 
Rearrangement of a sequence, 400 
“Reasonable” function, 66, 96, 125, 156 
Rectangle, closed, 454 
Recursive definition, 23 
Reduction formulas, 316 
Regulated function, 431 
Remainder term for Taylor poly- 
nomials, 343 
Removable discontinuity, 99 


Riemann-Lebesgue lemma, 272, 327 
Right-hand derivative, 132 
Rising Sun Lemma, 121 
Rolle, 161 
Rolle’s Theorem, 168 
Root of a polynomial function, 48 
double, 160 
multiplicity of, 108 
see also nth roots 
Root test, 408 
delicate, 408 


Same sign, 12 
Schwarz, H. A., 18, 190, 239
Schwarz inequality, 18, 239
Sec, 263 
inverse of, see Arcsec 
Secant line, 126 
Second coordinate, 55 
Second derivative, 137 
Second derivative test for maxima and 
minima, 176 
Second Fundamental Theorem of Cal- 
culus, 244 
Second Mean Value Theorem for inte- 
grals, 237 
Sequence 
absolutely summable, 397 
Cauchy, 379, 505 
complex, 477 
equivalence of, 505 
of complex numbers, 462 
convergent, 373 
pointwise, 415 
uniformly, 415 
decreasing, 377 
divergent, 373 
Fibonacci, 31, 430, 478 
increasing, 377 
infinite, 372 
limit of, 373 
nondecreasing, 377 
nonincreasing, 377 
rearrangement of, 400 
summable, 389 
Series 
absolutely convergent, 397 


conditionally convergent, 398 
convergent, 389 
Fourier, 270, 272 
geometric, 390 
power, 423, 471 
Taylor, 424 
Set, 22 
empty, 23 
Sets 
intersection of, 41 
notation for, 41—42 
Shadow point, 121 
Sigma, 24 
Sign, 12 
Sin, 41, 256, 259, 273-274, 276, 470 
addition formula for, 266
derivative of, 148, 260, 272-273 
inverse of, see Arcsin 
Taylor polynomials for, 334 
remainder term for, 348 
Sine, hyperbolic, 296 
Sine function, 41 
Sinh, 296 
Sketching graphs, 171-176 
Skew-field, 493 
Slope of a straight line, 56
Speed, instantaneous, 128 
Square root, 12, 434 
existence of, 102-103 
Square root in a field, 492 
Square root function, 453 
Squaring the circle, 367 
Step function, 235 
Stirling’s formula, 483 
Straight line (analytic definition), 56 
shortest distance between two points, 
276 
slope of, 56 
Strict maximum point, 190 
Sturm Comparison Theorem, 275 
Subsequence, 378 
Substitution, world’s sneakiest, 325 
Substitution formula, 308 
Subtraction, 5 
Sum
finite, 3-4 
of functions, 41 


infinite, 354, 388 
of an infinite sequence, 389 
of an infinite sequence of complex 
numbers, 462 
lower, 215 
partial, 388 
sigma notation for, 24 
upper, 215 
Sum of squares, 459 
Summable, 389, 463 
Abel, 431 
absolutely, 397 
Cesaro, 408 
uniformly, 419 
Supremum, 112 
Swift, Jonathan, 486 
Symmetry in graphs, 174 


Tan, 263 
inverse of, see Arctan 
Taylor series for, 497 
Tangent, hyperbolic, 296 
Tangent line, 125, 127 
point of contact of, 192 
“Tangent line,” vertical, 134 
Tanh, 296 
Taylor polynomial, 334 
remainder term of, 343, 345, 346 
Taylor series, 424, 469 
Taylor’s Theorem, 345 
Third Mean Value Theorem for inte- 
grals, 237 
Transcendental number, 362 
Triangle inequality, 68 
Trichotomy law, 9 
Trigonometric functions, 256 see also 
Cos, Cot, Csc, Sec, Sin, Tan 
integration of, 315-316 
inverses of, 263 see also Arccos, Arcsec, 
Arcsin, Arctan 


Two-times differentiable, 137 


Uniform limit, 415 
Uniformly convergent sequence, 415 
Uniformly convergent series, 419 
Uniformly distributed sequence, 387 
Uniformly summable, 419 
Uniqueness 

of factorization into primes, 31 

of limits, 80 

of power series representations, 430 
Unit circle, 64 
Upper bound, 111, 490 

almost, 120 

least, 111, 490 
Upper integral, 249 
Upper limit of integration, 219 
Upper sum, 215 


“Valley,” 59 
Value, absolute, see Absolute value 
Value of f at x, 38 
Vanishing condition, 390 
Velocity 
average, 128 
instantaneous, 128 
Vertical axis, 55 
Viete, Francois, 282 


Wallis’s product, 328 

Weierstrass, see Bolzano-Weierstrass 
Theorem 

Weierstrass M-test, 420 

Well-ordering principle, 23 

Wright, 326 


Zahl, 25 
Zero, division by, 6 




Printed by Interprint (Malta) Ltd 




SELECTED “WORLD STUDENT SERIES” EDITIONS 
OF OUTSTANDING UNIVERSITY TEXTBOOKS 
FOR STUDENTS OF MATHEMATICS 


*CALCULUS 
By Michael Spivak, Brandeis University 586 pp, illus


*PROBABILITY WITH STATISTICAL APPLICATIONS, Second Edition
By Frederick Mosteller, Harvard University,
Robert E. K. Rourke, St Stephen's School, Rome,
George B. Thomas, Jr., Massachusetts Institute of Technology 527 pp, 61 illus


*GENERAL STATISTICS, Second Edition
By Audrey Haber, University of California at Los Angeles
Richard P. Runyon, C. W. Post College of Long Island University 474 pp, 43 illus


*AN INTRODUCTION TO LINEAR ANALYSIS
By Donald L. Kreider, Dartmouth College,
Robert G. Kuller, Wayne State University,
Donald R. Ostberg, Indiana University,
Fred W. Perkins, Dartmouth College 79? pp, 177 illus


FUNDAMENTALS OF BEHAVIORAL STATISTICS, Second Edition
By Richard P. Runyon, C. W. Post College of Long Island University
Audrey Haber, University of California at Los Angeles 357 pp, illus


A FIRST COURSE IN CALCULUS, Third Edition 500 pp, 28 illus
A SECOND COURSE IN CALCULUS, Second Edition 305 pp, 116 illus
ANALYSIS I 460 pp
LINEAR ALGEBRA, Second Edition 400 pp, 59 illus
All by Serge Lang, Columbia University


ELEMENTS OF DIFFERENTIAL EQUATIONS 270 pp, 65 illus
ADVANCED CALCULUS 679 pp, 247 illus
Both by Wilfred Kaplan, University of Michigan


Ask your local bookseller for a complete list of inexpensive World Student Series editions


* An Open University Set Book


ADDISON-WESLEY PUBLISHING COMPANY

LONDON · READING, MASSACHUSETTS · AMSTERDAM · SINGAPORE · SYDNEY · TOKYO


